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CHAPTER  1 

DATA  COMPRESSION  AND  MAXIMIZATION  OF  INFORMATION  CONTENT 


1.1  INTRODUCTION 

Although  this  document  primarily  addresses  problems  at  the  various 
test  ranges,  it  should  provide  applications  for  use  throughout  the  scientific 
community.  Much  of  the  reference  material  used  was  derived  from  publications 
by  other  government  agencies,  contractors,  universities,  and  private  industry. 

Within  the  past  several  years  nearly  all  test  ranges  have  experienced 
an  exponential  growth  in  the  quantity  of  data  being  recorded  and  processed. 
Almost  concurrently,  large-scale,  real-time  data  processing  systems  have 
been  developed.  It  is  reasonable  to  expect  this  trend  to  continue  in  the 
years  to  follow.  Consequently,  there  Is  an  increasing  demand  for  minimising 
redundancy  and  compressing  the  masses  of  data  into  forms  which  may  be  more 
quickly  and  easily  assimilated. 

Much  effort  has  been  expended  at  various  locations  in  the  development 
of  reliable  means  for  transmitting  and  retaining  only  significant  changes 
In  data  instead  of  processing  all  that  is  generated.  Considerable  attention 
has  also  been  given  to  presenting  this  data  in  forms  conducive  to  early 
decision  making.  All  this  effort  has  produced  a number  of  techniques  which 
make  up  the  field  of  data  compression  and  maximization  of  information 
content. 


1.2  SCOPE 


This  publication  is  intended  as  a single  source  document  describing 
available  techniques  for  reducing  the  quantity  of  data  processed  and  for 
providing  meaningful  presentation.  It.  Includes  mathematical,  statistical; 
and  graphical  techniques  which  have  been  used  successfully. 
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1-3  DATA  COMPRESSION  DEFINITION 


The  expression  "data  compression”  has  broad  meaning  and  say  encompass 
-.my  or  all  of  the  following:  data  compaction,  bandwidth  cospression, 

redundancy  removal,  redundancy  reduction,  adaptive  saspling,  parameter 
extraction,  optimal  estimation,  and  possibly  a few  other  techniques.  In 
general,  data  compression  denotes  operations  which  are  performed  to  reduce 
the  quantity  of  data  prior  to  transmission,  but  which  still  preserve  the 
minimum  data  elements  of  a measurement  continuum  such  that  the  original 
information  may  be  reconstructed  within  established  limits  of  error. 
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CHAPTER  2 

REDUNDANT  DATA  REMOVAL/USEFUL  DATA  SELECTION 


2. L INTRODUCTION 

Data  redundancy  has  been  defined  as  "that  fraction  of  a message  or 

datua  which  is  unnecessary  and  hence  repet ir.ive  in  the  sense  that  if  it 

were  missing  the  aessage  would  still  be  essentially  complete,  or  at  least 

could  be  completed.  Redundancy  exists  whenever  the  sampling  rate...  exceeds 

the  frequency  required  to  describe  the  input  function  in  accordance  with 

f2— 61 

the  accuracy  requirements  of  the  user."1  1 The  methods  for  retaining 
data  which  provide  essentially  all  the  information  contained  in  the  original 
aessage  range  from  some  simple  visual  and  manual  techniques  to  complex 
computer  driven  algorithms.  However,  the  basis  of  all  removal/selection 
techniques  is  the  examination  of  each  data  sample  and  performing  a compari- 
son te-  preceding  or  succeeding  samples  in  the  context  of  some  arbitrary 

f2_gi 

reference  pattern. * 1 The  choice  of  methods  is  extensive  and  may  be 

adapted  to  virtually  3ny  set  of  circumstances  or  data. 

2.2  EDITlnC 


The  editing  processes  involve  the  identification  and  subsequent  removal 
of  data  estimates  which  are  considered  either  erroneous  or  non-essential  to 
the  information  content.  Additionally,  if  erroneous  data  during  "critical" 
intervals  are  edited  those  samples  must  sometimes  be  replaced  by  prediction/ 
Interpolation  techniques.  The  implementation  of  these  processes  depends  or. 
the  purposes  and  uses  of  the  data,  especially  in  the  context  that  raw  data 
are  generally  comprised  of  both  expected  and  abnormal,  or  unexpected,  input 
samples.  The  flight  engineer  will  consider  useful  chat  data  which  shows 
missile  performance  characteristics  only,  whereas  the  instrumentation 
engineering  will  be  interested  in  data  showing  instrumentation  failures. 

In  each  case  data  compression  may  be  accomplished  through  the  reduction 
methods  of  editing. 
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2.2.1 


MANUAL  TECHNIQUES 


Although  generally  acre  laborious  and  vulnerable  to  a 
certain  amount  of  subjective  judgement,  manual  editing  is  often  employed  to 
reduce  the  data  volume  and  select  useful  data,  especially  during  preliminary 
data  processing  stages.  Two  common  data  forests  are  lists  and  plots,  and 
in  each  case  techniques  may  be  employed  to  facilitate  the  reduction  process. 

A simple  method  of  editing  printed  data  is  to  arrange  it  in 
a columnar  format  and  sort  it  with  respect  to  some  key  parameter,  usually, 
but  not  necessarily,  time.  The  reorganization  of  data  in  this  way  improves 
the  capability  to  show  data  discontinuities  and  duplicated  samples,  which 
an  be  identified  for  removal.  When  samples  are  arranged  in  vertical 
■uxtaposition  with  respect  to  previous  and  subsequent  samples  of  the  same 
t. met  ion,  simple  trend  analyses  sav  be  accomplished.  These  include  such 
non-parasetric  tests  as  determination  of  zero  crossings  and  the  relative 
sizes  (value)  of  the  data  estimates.  In  order  to  facilitate  the  editing  of 
iita  lists  the  data  parameters  may  be  reconfigured  by  computing  simple 
first  or  second  differences  between  samples  which  will  detrend  the  data  and 
amplify  data  anomalies: 


iXi  = Xi  ' xj.l 


A*V  . 

~ xi 


= 


axi-l  38  xi  - 2xi-l  * xi-2 


(2.2-1 ) 
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The  data  nay  be  reconfigured  into  estimates  of  variances  over  short  intervals 

which  may  be  reviewed  to  determine  quickly  where  data  samples  may  be  edited. 

These  methods  of  reconfiguration  may  be  combined  to  provide  detrended 

variances  in  data  which  are  changing  in  a polynomial  fashion;  the  variances 

£2—21 

may  be  estimated  from  the  differenced  data, 


n _ 

E <Anx)2 


(m-n)nTnT 


(2.2-2) 


where  o 

x 


random  error  in  the  x coordinate 


the  nth  successive  difference  in  x 


ra  = the  number  of  points  used  in  each  sample 

Whan  using  this  technique,  n > 3 in  order  to  at  least  eliminate  quadratic 
trends. 

The  same  general  methods  are  used  in  analyzing  and  editing 
data  in  plotted  formats.  Trend  analyses  may  be  accomplished  more  readily 
because  of  the  ability  to  review  the  data  in  a more  condensed  form. 

Sampling  rates  may  be  determined  and  useful  data  spans  identified.  Discon- 
tinuities in  the  data  trend  and  spurious  samples  may  be  discerned,  and  in 
reviewing  data  in  plotted  fora  it  is  possible  to  determine  patterns  of 
abnormal  data  occurances.  A drawback  in  editing  the  data  from  the  graphical 
representation  is  the  loss  of  a certain  amount  of  data  resolution,  depending 
upon  presentation  scale  factors. 

2.2.2  BOUNDARY  LIMIT  EDITING 

The  simplest  computer  editing  methods  employ  a selection 
process  which  compares  the  present  data  value  to  preset  upper  and  lower 
limits: 

kj  < Xf  < k2,  (2.2-3) 
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where  the  k's  may  be  constant  or  even  some  function  of  (x).  Under  one 
option,  acceptance  occurs  when  the  condition  above  is  true;  under  the 
other  if  the  statement  is  false.  In  either  case  the  boundary  limit  test  is 
designed  to  eliminate  data  which  is  considered  unimportant. ^ ^ Because 
the  values  of  the  k's  must  be  predetermined,  a priori  knowledge  of  the 
nature  of  the  data  must  be  considered  in  planning  for  this  type  of  editing. 

Depending  on  the  environmental,  instrumentation,  or  processing 

characteristics  which  affect  the  nature  of  the  data,  the  following  factors 
[2-81 

may  be  utilized. 

a.  Time  Constraints.  Data  may  be  recorded  only  when  it 

vs  within  the  time  period  covering  a specified  maneuver  for  a particular 

test.  The  most  common  procedure  is  simply  to  turn  off  the  recorders,  or 

• dit  recorded  data  using  times  found  on  operational  notes  to  avoid  processing 

data  considered  meaningless.  Additionally,  data  sampling  or  compression 

nay  be  initiated  or  discontinued  on  the  basis  of  other  events,  which  are 
. ,,  12— X] 

measured  or  recorded. 

fc.  Physical  Bounds.  Variable  which  exceed  known  physical 
limitations,  e.g.,  velocities  of  Mach  10,  aircraft  altitudes  over  a million 
feet,  etc.,  need  not  ue  accepted  or  processed. 

c.  Calibration  Limits.  Telemetry  functions,  especially, 
r.l  )se  which  exceed  calibration  limits  will  probably  be  outside  the  desired 
testing  range. 

d.  Computer  Table  Limits.  Editing  criteria  may  be  based 
on  the  amount  of  available  computer  core  or  tape  storage  whenever  the  data 
samples  retained  will  meet  conditions  sufficiently  to  describe  the  entire 
population.  This  method  of  editing  is  usually  employed  if  further  data 
compression  methods,  viz,  regression  analysis  or  analysis  of  variance,  will 
be  used. 


e.  Detection  Threshold  Limits.  The  signal- to-noise  level 
of  all  functions  may  be  monitored  to  determine  if  any  data  are  in  fact 
being  received. 

f.  Historical  Limits.  Based  on  the  results  of  previous 
similar  tests,  expectation  bounds  may  be  determined  to  edit  subsequent 
tests.  The  historical  limits  will  usually  be  finer  than  the  physical 
bounds. 


g.  Statistical  Limits.  Estimates  of  variance  may  be 
computed  over  short  intervals  and  used  to  remove  erroneous  or  meaningless 
data,  or  to  sense  signficant  changes.  The  variation  of  the  general  test  in 
this  case  would  be 


~ksx  <x±<  ksx,  (2.2-4) 

where  s is  the  estimate  of  the  idard  deviation  in  x. 

h.  Trend  Limits.  jed  on  the  change  in  the  trend  of  the 
data,  a data  sample,  x^,  may  be  eliminated  if 

|*i  - Xj.jl  > k (2.2-5) 

where  the  boundary  k is  known  a priori. 

2.2.3  SOURCE  SELECTION 


When  there  are  simultaneous  measurements  of  a parameter  by 

more  than  one  instrumentation  system,  redundancy  exists  and  a best  estimate 

[2-121 

of  that  parameter  may  be  made  and  all  other  measurements  discarded. 

Source  selection  may  be  accomplished  in  two  ways. 

a.  Determination  of  the  best  source.  A preliminary  step 
here  is  to  eliminate  all  data  showing  apparent  malfunctions.  This  may  be 
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accomplished  by  using  the  various  methods  of  limit  checking.  Variance 
estimates  may  then  be  computed  from  each  set  of  data  and  utilized  as  weight- 
ing criteria  to  determine  the  best  data  set, 


W. 

3 


(2.2-6) 


In  practical  application,  changes  in  source  selection  should  be  made  only 
when  the  weights  change  significantly. 


b.  Computation  of  a combined  best  source.  The  relative 
weights  previously  calculated  in  the  determination  of  the  best  source  may 
be  used  to  compute  a set  of  data  which  is  a combination  of  all  sources  rx d 
which  provides  more  confidence  than  use  of  any  one  set  alone.  This  best 
estimate  may  be  used  as  the  data  source  during  further  processing.  The 
pooled  estimate  may  be  computed:^  ^ 


X1  * 


(2.2-7) 


l 

j*1  J 


2.3  SAMPLING 


Redundancy  removal  through  sampling  is  a direct  data  compression 

t hod  which  operates  on  the  data  in  such  a way  that  the  output  values  are 

che  actual  sample  values  of  the  input  data,  or  the  actual  sample  values 

[2-1] 

within  a tolerance.  These  data  compression  techniques  can  be  divided 

into  two  classes;  those  which  essentially  destroy  the  time  reference  and 

transmit  the  significant  samples  at  a constant  rate,  and  those  which  transmit 

[2—51 

only  significant  samples  as  they  occur  in  time.  The  first  method  is 

termed  fixed  rate  compression,  the  other,  variable  rate  compression. 

When  using  fixed  rate  sampling  it  is  assumed  that  the  data  character- 
istics are  constant  and  some  optimum  rate  may  be  determined  & priori.  This 
is  generally  based  on  the  highest  frequency  expected  in  the  data.  Since 


the  data  sampling  rate  is  known,  the  time  tag  need  not  be  carried  along  but 
cay  be  reconstructed  after  the  essential  processing  is  complete. 

Variable  rate  compressors,  on  the  other  hand,  have  greater  potential 
for  redundancy  reduction  because  the  output  saapling  rates  are  keyed  to 
variations  in  the  data  characteristics.  However,  because  of  this  flexi- 
bility, each  data  sample  requires  a tine  tag,  and  in  some  cases,  when 

combined  with  other  compressed  data,  may  result  in  more  data  bits  being 

[2—11 

transmitted  than  were  in  the  original  data. 

2.3.1  FIXED  RATE  COMPRESSION 


The  technique  most  commonly  used  is  simply  to  sample  the 
data  at  a rate  close  to  the  Nyquist  or  folding  frequency,  (f^),  which  is 
the  maximum  frequency  which  can  be  resolved  for  a given  saapling  rate,  At. 
Generally  most  sensors  are  sampled  at  sure  than  the  theoretical  minimum  of 
twice  the  highest  frequency  component: 


1 

JKt 


(2.3-1) 


and  frequencies  (f)  which  could  be  resolved  are: 


(2.3-2) 


If  significant  frequencies  which  are  higher  than  Hertz 
exist  in  the  data  they  will  appear  as  lower  frequencies  between  0 and 

Hz.  This  is  called  aliasing,  and  must  be  considered  in  the  determination 

[2-12 i 

of  the  sampling  rate  for  data  compression.  Figure  2-1  shows  a spectral 

representation  of  typical  data,  where  the  sampling  rate  (At  « 0.1  sec)  was 
far  greater  than  required  to  represent  the  highest  signficiant  frequency, 
in  this  case,  fg  •*  0.5  Hz.  Spectral  estimates  over  the  frequency  range  f > 
0.5  Hz  are  relatively  iow  amplitude  noise.  If  the  data  were  sampled  at  a 
reduced  rate,  for  example  At  < 1 second,  that  noise  would  be  aliased  into 
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tb«>  region  f < 0.5  Hz  with  some  adverse  effect  on  the  compressed  data.  To 
avoid  aliasing,  prefiltering  the  original  data  is  necessary.  This  may  be 
accoaplished  by  using  a low  pass  filter  with  a frequency  cutoff  equal  to 
the  highest  significant  frequency,  or  the  highest  desired  Nyquist  frequency. 

2.3.2  VARIABLE  RATE  COMPRESSION 

With  this  type  of  redundancy  reduction  the  waveform  is 

initially  sampled  at  a constant  rate  and  the  nonessential  samples  are 

eliminated  when  the  data  change  exceeds  a predetermined  tolerance  with 

respect  to  a reference  pattern.  The  choice  of  reference  patterns  used  to 

detect  redundancy  is  virtually  unlimited.  Examples  are:  polynomials,  expon- 

[2-7] 

entials,  and  sine  waves.  Of  the  many  techniques  the  most  widely  used 
and  discussed  are  the  polynomial  predictors  and  interpolators,  since  most 
data  can  be  expressed  or  approximated  in  that  form,  especially  over  the 
data  spans  to  be  tested.  A general  description  of  these  is  as  follows:^  ^ 

• A tolerance  window  is  placed  about  the  data  starting  at 
the  first  data  point. 

• Succeeding  points  which  fall  within  the  tolerance  window 
are  considered  redundant  and  are  discarded. 

• When  a succeeding  polat  falls  outside  the  window,  an 
appropriate  point  is  saved  and  a new  toleran-**  window  is  placed  about  the 
succeeding  data. 

e Each  time  a point  falls  outside  the  window,  a new  window 
is  used  for  the  succeeding  data. 

2. 3. 2. 1 PREDICTORS 

A pradletor  is  an  algorithm  that  estimates  the 
value  of  each  new  data  sample  basad  on  pas;  pjitormance  of  Che  d«*t*.  If  the 
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new  data  value  falls  within  the  tolerance  range  about  the  estimated  new 

value,  it  is  rejected  as  redundant  since  it  is  known  that  the  data  value 

[2-71 

can  be  constructed  within  that  tolerance  range,  A class  of  redundancy 

reduction  techniques  using  predictors  asruaes  t’  sample  will  follow 

(2-11 

an  n-th  order  polynomial  of  the  form 

x^  * ^ + ...  + Auxi  ^ (2.3-3) 

where  xi  is  the  predicted  sample, 
xi_1  is  the  previous  sample, 

AnXj_j  are  the  successive  differences  as  defined  in  subparagraph 
2.2.1.  A tolerance  of  t k can  then  be  established  about 

2 . 3 . 2 . 1 . 1 ZERO-ORDER  PREDICTORS 


CoHaonly  known  as  the  "Step  Method,"  the  zero*- 

order  predictor  is  the  simplest.  For  the  zero-order  predictor,  n * 0,  and 

C2-H 

equation  (2.3-3)  reduces  to4 

x.  3 x,  , (2.3-4) 

X l-”  i 

and  the  redundancy  test  is 


x.  . - k < x,  < x,  , <•  k (2.3-5) 
i-l  - i i~l 

Each  falling  the  test  is  saved  as  non-redundant 
and  is  used  as  the  new  reference  for  subsequent  tests.  This  method  is  also 
known  as  Che  iioatiog  point  aperature,  simply  because  the  tolerances  follow 
the  iafrui:  who's.  An  example  is  shown  in  Figure  2-2. 
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FIGURE  2-2.  Variable  rate  compression  methods. 
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2. 3.2. 1.2  FIRST-ORDER  PREDICTOR 


predictor 


Setting  n ■ 1 in  equation  (2.3-3),  the  first-order 


X « x 4*  Av 

i i-l  i-i 

- 2xi_!  - xi_2  (2.3-6) 

[2-11 

Is  obtained.  The  extrapolation  equation  is  a straight  line  drawn 

between  the  last  two  data  points.  Since  ^ represents  tixs  change 
between  the  previous  two  saaples,  the  predicted  saaple  is  the  previous 

[2-5] 

sample  plus  the  change  chat  occurred  between  the  previous  two  saaples. 

The  redundancy  test  is  the  same  as  that  shown  in  expression  (2.3-5).  When 
a sample  falls  outside  the  tolerance,  the  preceding  saaple  is  considered 

aonredundant. ^ 


It  follows  that  higher  order  predictors  can  be 

built  by  considering  aore  past  data.  Although  the  higher  order  predictors 

will  tend  to  provide  high  conpression  efficiency  on  highly  active  data, 

exDsrience  has  shown  that  a lev  order  predictor  will  provide  equal  or 

£2-71 

greater  compression  efficiency  for  most  telemetry  data. 

The  conpression  efficiency  is  basically  the  fidelity 
• f reconstructing  the  original  waveform  with  respect  to  the  amount  of 
redundancy  reduction.  Although  there  are  certain  trade-offs  with  respect 
to  the  variance  in  the  data  and  the  type  of  predictor  to  be  used,  general 
rules  have  been  established.  The  zero-order  predictor  is  prefectly  matched 
to  data  which  vary  as  step  functions,  such  as  data  calibrations  or  discrete 
events.  Because  of  horizontal  tolerance  lialts  the  zero-order  predictor  is 
at  a disadvantage  where  data  activity  is  high  with  many  vertical  series  of 
adjaccrt  points.  However,  in  the  presence  of  noise  only  the  zero-order 
predictor  tends  to  set  up  strictly  horizontal  limit  lines  which  are  automa- 
tically parallel  to  the  noisy,  actionless  data.  In  the  presence  of  noise 
spikes,  or  wild  points,  the  zero-order  predictor  works  well  since  it 


automatically  keeps  those  points  and  does  not  have  to  keep  a point 

for  every  vertical  increment  of  one  tolerance  magnitude.  The  noise 

can  then  be  effectively  compressed  if  the  tolerance  limits  are  suf- 

[2-91 

ficiently  wide. 1 1 


Because  the  first-order  predictor  is  responsive  to 

changes  in  the  data  it  generally  works  best  on  data  exhibiting  a high  level 

of  vertical  activity  and  relatively  low  noise.  A disadvantage  of  the 

first-order  predictor  is  the  possibility  of  getting  hung  up  on  heavy  noise, 

and  while  the  zero-order  predictor  is  handicapped  by  vertical  variations  of 

the  data,  noise  tends  to  reduce  the  efficiency  of  the  first-order  to  an 

even  greater  extent.  When  these  conditions  are  nixed,  i.e.,  high  noise- 

high  vertical  activity,  or  low  noise-little  vertical  activity,  the  two 

[2-9] 

methods  generally  perform  with  equal  efficiency. 

2. 3.2. 2 UfTERPOLATORS 

Prediction  techniques  are  based  on  the  assumption 

that  the  data  will  remain  relatively  constant  from  one  time  interval  to  the 

next.  If  the  data  vary  continuously  or  are  corrupted  sporadically  by 

noise,  the  redundancy  reduction  efficiency  of  the  predictor  generally  will 
12^1 

be  reduced. 1 "*  ' In  such  cases  the  compression  efficiencies  could  be 

increased  if  both  past  and  future  data  samples  could  be  used.  This  process 

of  determining  redundancy  after  the  sample  has  been  examined  is  called 

interpolation. ^ Interpolators  differ  from  predictors  in  that  all 

sample  values  between  the  last  transmitted  valuf;  end  the  present  value 

[2-11 

affect  the  interpolation.  Interpolation  uses  present  staples  to 

determine  where  past  samples  should  have  bees  and  compares  this  prediction 

[2-5] 

to  the  actual  position  of  the  past  sample. 

2. 3. 2. 2.1  ZERO-ORDER  DfTULPOLATOR 

The  zero-order  interpolator,  like  the  zero- 
order  predictor,  is  a horizontal  aperture  device  with  "step-wise"  tolerance 
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Halts.  However,  whereas  the  predictor  utilises  only  knowledge  of  the 

initial  saaple  value  ia  locating  the  aperture,  the  zero-order  interpolator 

operates  by  maximizing  the  length  of  tiae  the  original  waveform  stays 

[2—31 

within  the  aperture.  One  aethod  of  lapleaenting  this  is  to  place  one 

of  the  tolerance  bounds  at  the  first  point  and  consider  this  to  be  the 


maximum  or  aininua  value  in  the  redundant  data  set,  depending  on  the  slope 
of  the  curve.  The  aperture  is  initially  centered  at  xQ  ± k and  the  entire 
space  is  2k.  Whenever  a saaple  exceeds  the  2k  limits,  that  saaple  is  used 
to  initiate  the  next  tolerance  band  and  the  transaitted  saaple  is  the 
average  of  the  maximum  and  aininua  saaple  values  in  the  tolerance  band. 


(2-5] 


x , + x 
sin  max 


(2.3-7) 


.•here  * transaitted  saaple. 


a saailest  saaple  value  in  the  redundant  set. 


sex 


'argest  sample  value  in  the  redundant  set. 


The  spread  i«  • can  be  tolerated  in  the  zero-order  interpolator  is  strictly 
dependent  '■<?, the  predefined  error.  The  value  transaitted  is  approximately 
the  cencreid  Oi  that  redundant  data  set. 


2. 3.2.2.? 


FIRST-ORDER  INTERPOLATOR 


The  iapleaentation  of  the  first-order  inter- 
polator say  take  several  fora#;  however,  the  aost  conun  is  the  "Fan  Method" 
f.'cposed  by  Gardeahire.  This  involves  computing  two  slopes,  both  originating 
at  the  last  transmitted  saaple,  directed  to  the  upper  and  lower  tolerance 
Halts  of  the  next  sample.  These  slopes  are  used  to  test  the  subsequent 
saaple,  and  if  ir  falls  within  the  tolerance  limits,  a new,  more  restrictive 
fan,  defined  by  the  new  tolerance  limits,  is  used  to  test  the  subsequent 
point.  As  slopes  are  drawn  from  one  saaple  around  future  samples,  only 
the  most  restrictive  slope  above  and  the  most  restrictive  one  below  are 
stored.  The  implementation  of  this  is  relatively  simple  and  involves 
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little  data  storage  since  only  five  words  of  memory  are  necessary  - the 

two  slopes,  the  original  saaple,  the  last  sample  and  the  selection  tolerance 

[2-4] 

- regardless  of  hov  many  saaple s are  between  the  end  points.  Whenever 

a saaple  exceeds  the  tolerance  of  the  fan,  the  preceding  saaple  is  used 
as  the  origin  of  the  next  set  of  tolerance  fans. 

Since  future  saaples  aust  be  exaained  to 

deteraine  redundancy,  the  transaission  of  the  non-redundant  saaple  will  be 

delayed.  Thus  there  aay  be  a aajor  disadvantage  in  atteapting  to  use 

[2-5] 

interpolators  for  real-tiae  processing. 

The  predictors  use  only  past,  transmitted 

saaples  as  a basis  for  future  prediction  to  deteraine  redundancy.  However, 

since  they  use  the  set  of  future  data  to  determine  if  a particular  saaple 

should  be  transmitted,  they  have  a distinct  advantage  over  the  predictors. 

If  the  saaple  contains  noise,  the  noise  will  be  predicted  to  occur  in  the 

next  saaple.  Therefore,  that  saaple  will  probably  fail  the  redundancy 

test.  This  pattern  could  continue  at  each  succeeding  saaple  making  it 

difficult  for  the  predictor  to  provide  stable,  non-redundant  data.  By 

using  knowledge  of  future  variations  in  the  dsta,  interpolators  tend  to 

reduce  the  effects  of  noise  in  transmitting  non-redundant  saaples,  and 

[2-5] 

require  a lower  signal-to-noise  ratio  than  the  predictors. 


In  making  visual  comparisons  of  the  effi- 
ciency of  the  various  redundancy  reduction  techniques  on  telemetry  data, 
Lunsford  observed  that  the  first-order  predictor  tends  to  retain  data  peaks 
better  than  either  the  zero-order  predictor  or  the  first-order  interpolator. 
The  advantage  of  the  first-order  predictor  over  the  zero-order  algorithm  is 
that  the  first-order  limits  generally  have  slope  when  approaching  a data 
peak  so  that  the  upward  or  downward  trend  of  points  after  the  peak  is 
picked  up  sooner  than  if  the  lines  were  horizontal.  While  the  first-order 
interpolator  has  the  same  advantage  as  the  first-order  predictor,  it  does 
not  define  peaks  as  well  for  Identical  tolerances,  because  the  limits  for 
redundancy  are  wider. ^ ^ 
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On  the  basis  of  examining  sixteen  telemetry 
functions  with  both  predictors  and  the  first-order  interpolator,  along  with 
variations  of  these  methods,  Lunsford  concludes  that  the  zero-order  inter- 
polator should  compress  as  efficiently,  if  not  more,  than  any  of  the  other 
[2-9] 

three  aethods.  However,  a aajor  factor  affecting  th»  efficiency  of 

each  compression  algorithm  is  the  tolerance  selected.  Although  Figure  2-2 
illustrates  each  redundancy  reducing  method  with  essentially  the  saae 
tolerance,  the  optimum  tolerance  is  dependent  upon  the  technique  and  data 

' hd  racteristics. 


2. 3.2.3  TOLERANCE 

Once  the  decision  concerning  the  type  of  coapres- 

>i.M  method  to  be  used  Is  made,  the  size  of  the  tolerance  limits  must  be 

: retained.  Since  noise  is  essentially  random  redundant  data,  the  tolerance 

generally  should  be  set  large  enough  to  enable  the  algorithm  to  suppress 

noise.  Secondly,  the  tolerance  should  provide  a relatively  high  compression 

f2-91 

ratio  without  significantly  distorting  the  active  data.  A priori 

r-nowledge  of  the  data  characteristics  is  necessary  in  choosing  the  optimum 
"oierance.  The  compression  ratio  which  is  an  important  factor  in  determining 

the  effectiveness  of  the  compression  algorithm  and  tolerance  is  defined 

1 2-51 


„„  _ Total  number  of  samples  (2.3-8) 

UK  ~ ~ ; 

Number  of  significant  samples 

Figure  2-3  shows  the  increase  in  compression  ratio 

- [2-9 i 

for  different  redundancy  reduction  methods  over  a tolerance  range.  The 

tolerance  limits  are  expressed  in  percentoge  of  amplitude  bandwidth,  and 

the  vertical  scale  shows  the  average  compression  ratio  for  16  telemetry 

functions  tested.  These  curves  are  intended  to  show  approximate  relative 

increase;!  in  compression  ratios  vs  tolerances  for  each  predictor. 


2. 3. 2. 4 ERRORS 


f 


Because  of  the  various  uncertainties  in  choosing 

the  method  and  tolerance,  information  will  be  lost  and  fidelity  of  the 

original  data  will  be  affected.  Gardenhire  considers  the  tolerance  to  be 

an  estimate  of  the  maximum  guaranteed  error,  and  within  this  range  nrovides 

£2—41 

typical  error  distribution  curves  for  the  redundancy  reduction  methods. 

The  results  for  401  samples  and  a tolerance  of  .5%  are  shown  in  Figure 
2-4  for  the  three  methods.  The  curves  show  that  the  associated  error 
distributions  are  far  different  from  normal  distribution  curves.  For  the 
first-order  interpolator  the  errors  are  more  evenly  distributed  over  the 
entire  tolerance  band  while  for  the  first-order  predictor  they  peak  at  a 
very  low  error.  The  zero-order  interpolator  peaks  at  a higher  error,  but 
because  of  the  relative  distributions  its  mean  error  is  lower  than  that  of 
the  first-order  interpolator. 


2. 3. 2.5  RECONSTRUCTION 

Restoration  of  the  data  to  its  original  form 
•within  the  tolerances  already  determined  may  be  necessary  whenever  further 
processing  requires  that  the  data  be  input  at  a fixed  rate.  This  may  be 
necessary  at  the  receiving  end  when  the  data  is  compressed  for  transmission. 
Because  of  the  differences  in  the  algorithms  used  to  compress  the  data, 
there  are  some  considerations  which  affect  the  decompression,  or  reconstruc- 
tion problem.  Basically  the  reconstruction  method  is  determined  by  the 

[2—6] 

method  of  redundancy  reduction.  Zero-order  reconstruction  fills  in 

redundant  samples  which  are  equal  to  the  last  sample  transmitted  until  a 
new  sample  is  received.  The  first-order  reconstruction  process  basically 
consists  of  connecting  non-redundant  values  with  straight  lines  through 
linear  extrapolation. 

In  the  real-time  sense,  the  predictors,  which  were 
described  herein,  are  relatively  easy  to  decompress.  With  the  zero-order 
case,  all  redundant  samples,  which  must  be  reconstructed,  will  be  within 
the  original  tolerance  bounds  but  may  not  follow  the  original  waveform. 
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asp-jclally  if  the  tolerance  bounds  were  large,  Tlie  reconstruct!*.  i 
of  the  vedundanfc  data  removed  by  the  first-order  predictor  uses  cue 
fact  that  after  the  rlrst  two  samples  are.  transmitted),  ux,  the  change  xn  x 
which  uefincs  the  slope  of  the  tolerance  bounds,  is  known  and  can  ba  uaed 

to  reconstruct  the  redundant  samples  until  the  next  non-redundant  sample  Is 

_ [2-41 

transmitted. 

The  interpolator01  as  described  herein  present  a 
problem  for  reconstruction,  especially  when  data  are  sampled-for.  trans- 
mission. When  using  these  algorithms,  the  non-redundant  data  values  and 
slopes  are  not  known  ti'-r  transmitted  until  the  ongest  possible  line 
pvgment  has  been  fitted  to  t!'e  data.  This  mak<_  it  impossible  to  recon- 
struct the  original  data  without  imposing  some  delay.  The  delay  may  be  a 

biaior  problem,  especially  if  clie  data  values  remain  within  tolerance 

-T2-6I  - ~ " . 

during  the  entire  test.  - However,  because  -all  the  data  variations  on 

rbe  compression  end  are  known,  the  reconstructed  samples  tend  to  provide 

greater  fidelity  with  th®  original-  data. 

The  zero-order  interpolator  transmits  average 

estimates  of  the  data  in  the  tolerance  bands.  Therefore,  the  reconstructed 

data  tend  to  follow  the  most  likely  estimate  of  the  original  redur.daut 

data.  The  first-order  interpolator  has  a similar  advantage.  Since  all 

saapi®  points  fall  within  the  aperture  space,  ".here  is  no  excess  over  the 

j2^3  2-4] 

boundary  as  may  e^isc  with  the  t*„_  predictors.  ' 
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CHAPTER  3 

TRANSFORM  METHODS /LUMPED  PARAMETER  TECHNIQUES 


3. 1 INTRODUCTION 


Although  the  techniques  described  in  this  chapter  are  often  thought 
of  as  analysis  rather  than  data  compression  techniques,  they  can  be  used 
not  only  to  co«q>ress  data  output  from  a coaputer,  but  also  data  stored 
internal  to  it.  They  also  allow  the  user  to  sake  more  intelligent 
conclusions  than  could  be  attained  by  simply  inspecting  the  raw  data. 


In  real-time  data  reduction  it  is  imperative  that  the  test  conductor 
be  presented  with  information  he  can  assimilate  in  as  short  a time  as 
possible.  For  example,  is  it  desirable  to  reduce  a long  time  history 
into  a small  number  of  ccmputed  parameters  which  characterize  the  complete 
time  history,  or  to  combine  several  parameters  into  one  result  upon 
which  a decision  might  be  based?  The  methods  given  here  cannot  only 
save  considerable  time,  money,  and  paper  when  used  judiciously  in  assess- 
ing the  results  of  an  experiment,  but  alsc  will  give  more  incisive 
pinpointing  of  what  actually  happened  in  the  experiment.  When  properly 
utilized  in  teal  time,  the  test  conductor  or  flight  controller  can  leave 
the  display  room  with  full  knowledge  of  his  results  rather  th»,r.  waiting 
several  days  for  stacks  of  computer  listings  which  are  difficult  to 
assimilate. 


The  Fourier  Transform  and  Power  Spectrum  allow  display  of  informa- 
tion related  to  the  frequency  content  in  the  data.  The  Walsh  Transform 
allows  computation  and  display  of  information  related  to  the  mracer  of 
zero  crossings  in  the  data.  The  transfer  function  allows  representation 
of  large  quantities  of  data  collected  from  a complicated  system  by  a 
relatively  small  number  of  coefficients.  Non-dimensionalized  parameter? 
allow  combining  of  several  parameters  into  one;  both  for  reducing  the 
output  required,  and  for  ease  of  assimilating  information. 
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3.2  FOURIER  TECHNIQUES 


s 


t 
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Fourier  Analysis  has  been  a rich  area  in  applied  mathematics  for 
over  150  years.  However,  only  in  recent  years,  with  the  growth  of 
digital  computers  and  the  introduction  of  the  Fast  Fourier  Transform,  is 
the  full  potential  of  this  subject  being  realized.  The  ability  to 
readily  calculate  the  discrete  Fourier  Transform  provides  a very  appeal- 
ing data  compression  technique. 

The  definition  of  the  Fourier  Transform  of  a function,  f(t),  is 
given  by  the  well-known  Integral 

00 

F(U)  . j f(t)  e-iwtat  (3.2-1) 

with  the  reciprocal  formula  for  the  inverse. 

£(t)  * -nr  «(“)  eiut  <*.  <3-2-2> 

When  t is  time  then  w is  the  frequency  in  radians/sec. 


For  digital  data  the  Discrete  Fourier  Transform  **usr.  be  used  and  is 
defined  as 

u 

F(w)  * 75$ — £ £{kh)e~la>kh  (3.2-3) 

***  k«Q 

where  h is  the  sampling  interval. 

2 

The  highest  freq-jsscy  discernible  in  descrete  data  equals  r and  the 

1 n 

finest  resolution  between  frequencies  equals  Since  up2fff,  where  f 

is  the  frequency  in  cycles/sec,  then  the  above  definition  becomes 

F <“>  - r * F<3>  * 75T  .£„  f(kh»  « T1-  < 

A*v 
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(3.2-4) 


By  appropriately  rewriting  equation  (3.2-4)  advantage  nay  be  taken 
of  redundant  calculations  to  significantly  reduce  the  aaount  of  computing 
required  in  calculating  F(w).  Using  this  approach  in  the  early  1960s, 
Blackaun  and  Tukey  developed  the  Fast  Fourier  Transform  (FFT) . Today 
■ost  conputing  organizations  have  software  or  hardware  implementations 
of  the  FFT.  Hence,  it  is  possible  to  compute  the  Fourier  Transfom 
routinely  on  discrete  sequences  which  would  have  been  impossible  before 
the  FFT  was  developed.  Not  only  is  it  possible  to  perfons  this  computa- 
tion in  the  batch  mode,  but  it  is  also  possible  in  many  cases  to  perform 
it  in  near  realtime. 

A quantity  closely  related  to  the  Fourier  Transform  is  the  Power 
Spectrum.  This  function  is  defined  as  the  Fourier  Transform  of  the 
autocorrelation  function;  however,  it  can  be  shown  that  this  definition 
reduces  to  just  the  square  of  the  absolute  value  of  the  Fourier 
Transform.  That  is 

Gf(w)  - Re  (f(oi)}2  +■  ^ {f (co)>2  (3.2-5) 

This  function  gives  an  indication  of  the  distribution  of  the  power  as  a 
function  of  frequency  in  the  data  being  analyzed. 

The  Fourier  Transform  and  Power  Spectrum  can  be  used  for  certain 
categories  of  experiments  to  greatly  compress  the  amount  of  data  input 
required  to  assess  the  results  of  the  experiment.  The  most  extreme 
example  of  this  compression  can  be  seen  by  considering  the  case  of  a 
pure  sine  wave,  f(t)a>sin  us  t.  The  Fower  Spec  true  for  this  case  will  be 
the  delta  function,  6(u>-uso);  that  is,  all  the  power  in  the  function  is 
concentrated  at  uo.  For  the  realistic  case  of  finite  data  length,  the 
Power  Spectrum  will  be  represented  by  a spike  as  shown  in  Figure  3-1. 


Figure  3-1 
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He ace,  for  this  particular  case,  what  could  have  been  a sequence  of 
thousands  of  {Mints  in  the  tine  done  in  is  reduced  to  one  pertinent  point 
in  the  frequency  domain. 

Although  it  is  very  rare  that  a pure  sign  wave  is  encountered  in 
practice,  it  is  often  true  that  uost  of  the  energy  in  a parameter  is 
concentrated  in  a few  narrow  frequency  ranges  and  that  a good  approxima- 
tion of  the  parameter  is  given  by  a sum  of  sign  waves  in  these  ranges. 

With  some  a priori  knowledge  of  the  outcome  of  a test,  a test  controller 
can  limit  his  output  to  cover  the  frequency  range  of  interest  and  then 
not  only  significantly  reduce  the  quantity  of  data  output,  but  also  have 
the  results  in  a form  from  which  conclusions  con  be  drawn. 

The  Fourier  Transform  can  also  be  used  for  saving  computer  storage 
requirements  and  for  reducing  the  bit  rate  required  in  transmitting 
data.  In  many  cases,  the  Fourier  Transform  of  a signal  or  a curve  is 
dominated  by  relatively  few  of  the  F(j)  given  in  equation  (3.2-4).  In 
such  cases  only  the  F(j)  which  contribute  significantly  to  the  curve 
must  be  stored  or  transmitted.  By  storing  or  transmitting  only  those 
significant  F(j)  rather  than  the  complete  signal  in  the  time  domain, 
mass  memory  requirements  or  channel  bandwidth  requirements  can  be  signi- 
ficantly reduced. 

i 

3.3  TRANSFER  FUNCTIONS  j 

The  transfer  function  is  defined  as  the  ratio  of  the  system  Input 
to  the  system  output  in  the  Laplace  domain.  It  is  usually  used  to 

characterize  the  frequency  response  of  a system.  j 

j 

A constant  coefficient  linear  system  can  be  represented  by  the 
following  vector  differential  equation.  j 

||  * AX  + SO  (3.3-1)  ^ 

j 
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Here  X is  the  stste  or  output  vector,  U is  the  control  or  input  vector, 
A is  the  state  transformation  matrix  and  9 is  the  control  or  input 
matrix.  For  a multiple- input  multiple-output  system  the  vectors  U and  X 
contain  all  pertinent  input  and  output  parameters,  respectively. 


In  practice  it  is  desirable  to  know  the  transfer  function  of  one  of 
the  output  parameters  with  respect  to  one  of  the  Input  parameters.  For 
such  a case,  it  can  be  shown  that  the  relationship  between  the  input  and 
output  can  be  derived  from  equation  (3.3-1)  in  the  following  form: 


+ ai 


dn**XX 


d***1*! 


a X * b 
n ° dt" 


+ dt*"1  * ’**  b»U 

Here,  x and  u are  particular  components  of  X and  U. 


(3.3-2) 


The  transfer  function  H(s)  is  obtained  by  taking  the  Laplace  Trans- 
form of  both  sides  of  (3.3-2)  and  obtaining  the  ratio  of  X to  0,  to  give 
the  following: 


H(s,  - iiff  - -2 
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(3.3-3) 


Hence,  the  set  of  coefficients  denoted  by  the  ^'s  and  b^'s  characterize 
the  relationship  between  * and  y.  If  i is  substituted  for  s,  then  H 
becomes  the  system  frequency  response  function. 


The  transfer  function  can  be  calculated  in  several  ways,  among 
them:  Fourier  Transform,  Z-transform,  and  parameter  identification.  In 
the  Fourier  Transform  technique,  the  discrete  Fourier  Transform  of  both 
the  input  and  output  is  taken  and  substituted  Into  the  left  hand  side  of 
(3.3-3).  Then  a rational  function  numerical  fit  is  made  to  the  trans- 
formed data  to  give  the  a^'s  and  b^’s. 
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In  the  Z transfora  technique,  the  transformation  Z»e  is  used  to 
convert  the  differential  equation  (3.3-2)  to  a difference  equation. 

Data  values  in  the  time  domain  are  then  substituted  into  the  difference 
equation,  usually  giving  an  overdetermined  systea  of  linear  equations 
with  the  a^s  and  b^'s  being  the  only  unknowns.  The  a^'s  and  b^'s  are 
then  solved  for  by  the  method  of  least  squares. 

In  the  parameter  identif i cat ion  method,  the  matrices  A and  B are 
usually  determined  by  finding  those  respective  values  which  will  give 
the  solution  X(t)  of  (3.3-1)  which  most  closely  matches  a set  of  measure- 
ments of  X,  given,  also,  measurements  for  6-  There  are  several  techniques 
used  in  parameter  identif lcatioc;  among  them  are  maximum  likelihood, 
Kevton-Rapheson,  and  Quasiiinearization.  The  coefficients  a^  and  can 
be  easily  determined  from  the  matrices  A and  B.  The  details  of  parameter 
identification  techniques  are  beyond  the  scope  of  this  document.  However, 
lurther  detail  may  be  found  in  references  (3-3l  and  [3-4J 

For  systems  which  are  approximately  linear,  the  transfer  function 
can  be  used  to  reduce  a long  time  history  of  data  for  systea  output  and 
input  to  a small  set  of  coefficients  which  rel-te  the  two.  Also,  by 
looking  at  the  roots  of  the  numerator  and  denominator  of  the  transfer 
function,  we  can  determined  the  stability  characteristics  of  the  system. 
Hence,  a test  conductor  who  is  analyzing  his  data  in  a near  real-time 
mode  will  immediately  have  all  the  information  needed  to  make  decisions 
on  the  test.  In  this  case,  cot  only  would  a large  stock  of  tabulated 
data  be  awkward  to  work  with,  it  would  also  not  provide  him  with  the 
information  needed  to  assess  the  results  of  the  test.  Hence,  using  the 
transfer  function  not  only  reduces  the  quantity  of  d„ta  output,  but  also 
provides  the  user  with  information  in  a form  conducive  to  decision 
making. 

3.4  WALSH  TRANSFORMS 

The  Walsh  Transform  is  analagous  to  the  Fourier  Transform  in  that  a 
function  or  signal  is  represented  by  a series  of  orthogonal  functions. 
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Just  as  the  Fourier  Transform  is  useful  for  representing  signals  composed 
of  oscillatory  components,  the  Walsh  Transform  is  extreaely  useful  in 
representing  signals  composed  of  a number  of  discrete  level  changes.  The 
analog  to  the  frequency  for  the  Fourier  Transform  is  the  "sequency"  or 
number  of  zero  crossings  for  the  Walsh  Transfora. 

The  orthagonal  functions  used  in  performing  the  Walsh  Transfora  are 
known  as  Walsh  Functions.  The  first  16  Walsh  Functions  are  shown  in 
Figure  3-2, 

If  f is  a data  vector  of  length  N;  then  the  one-dimensional  Walsh 
Transform  F of  f is  defined  as 


where  W is  an  NXJi  matrix,  the  rows  of  which  are  the  sampled  Walsh 
Functions  V. . The  inverse  transfora  is  given  by 

1 * -7S-  wn?  <2.4-2) 

Hence,  the  forward  and  Inverse  transforms  can  be  implemented  by  the  same 
hardware  and  software. 

The  Walsh  Transfora  can  be  used  for  data  compression  in  a similar 
manner  to  the  Fourier  Transform.  In  certain  cases  only  a relatively 
small  number  of  the  elements  of  F are  significant.  In  such  cases  only 
these  significant  components  need  be  retained  for  a large  savings  in 
memory  or  channel  bandwidth  to  be  achieved.  The  signal  can  be  reconstructed 
using  equation  (3.4-2)  with  the  insignificant  coaponents  set  to  zero. 

This  technique  has  been  especially  useful  in  reducing  bandwidth 
requirements  for  transmitting  digitized  video  signals.  Iu  this  case  the 
screen  image  is  coaposed  of  a relatively  few  discrete  shades.  The  Walsh 
Transform  is  highly  suited  for  representing  the  signal  which  generates 
these  shades.  References  [3— 5 J and  {3-9J  give  further  details  on  the 
use  of  this  technique. 
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3.5  NON-DIMENSIONALIZED  PARAMETERS 


i 


Non-dimensionalized  parameters  have  been  used  for  many  years  by 
aerodynamicists  for  characterizing  aerodynamic  forces  and  moments  in 
fluid  flows.  These  non-dimensionalized  parameters  can  be  viewed  as  « -ta 
compressors  in  that  they  lump  together  several  parameters  into  one 
parameter.  As  in  cases  discussed  previously,  this  reduction  also  usually 
means  that  the  lumped  parameter  can  be  more  easily  interpreted  than  the 
several  quantities  could  be  separately. 


For  example,  in  incompressible  viscous  fluid  flow  through  pipes, 
the  Reynold’s  number,  which  is  a non-dimensioned  parameter  made  up  of 
four  physical  quantities:  density,  viscosity,  pipe  diameter,  and  flow 
velocity;  uniquely  determines  the  value  of  the  resistance  coefficient 
for  a given  surface  geometry  of  the  pipe.  Hence,  there  i3  no  need  to 
obtain  data  at  all  possible  densities,  pipe  diameter,  and  flow  velocities, 
but  only  to  run  experiments  at  varying  values  of  the  Reynold's  number. 

The  extent  to  which  a group  of  related  quantities  can  be  reduced  to 
dimensionless  parameters  is  governed  by  the  Buckingham  theorem.  This 
theorem  states  that,  given  a physical  equation  f X^^jX^,  — X^)=0, 
where  the  X.'s  are  dimensional  physical  quantities  related  to  the 
physical  phenomenon  of  interest,  that  there  can  be  N-M  dimensionless 
quantities  describing  the  same  phenomenon,  given,  as  follows: 


f(X^,  X2,  X^,  X^j)  — 0 (TT^j’n^j'TC  ...  0 


(3.5-1) 


where  M is  the  number  of  fundamental  physical  dimensions  in  the 
problem.  In  pure  mechanics  problems,  the  fundamental  units  are  mass, 
length,  and  time.  Hence,  by  non-dimensionalizing,  the  number  of  quanti- 
ties to  be  considered  can  be  reduced  by  three. 


As  for  the  previous  cases  discussed,  not  only  is  a reduction  in 
quantity  of  data  achieved,  but  also  it  is  easier  to  assimilate  the 
-esults  of  a test  by  considering  the  reduced  3et  of  dimensionless 
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parameters  instead  of  the  complete  set  of  physical  quantities.  Thus, 
for  example,  a test  conductor  should  be  able  to  gain  considerably  more 
information  from  viewing  a force  coefficient  than  from  viewing  separately 
the  force,  density,  and  velocity  which  constitute  the  force  coefficient. 

In  conclusion,  before  any  new  data  analysis  is  set  up,  careful  considera- 
tion should  be  given  to  using  appropriate  non-dimensionalized  parameters 
for  reducing  the  quantity  of  data  to  be  output. 

3.6  PITFALLS 

The  methods  described  in  this  chapter  can  be  extremely  useful  in 
_omp res sing  data  or  increasing  the  information  content  of  data  to  be 
lesented.  However,  as  is  true  with  any  mathematical  technique,  extreme 
are  should  be  taken  in  using  these  methods.  The  user  should  be  as 
familiar  as  possible  with  the  physical  phenomenon  which  is  being  repre- 
sented and  should  make  a careful  assessment  of  whether  the  technin-  :s 
here  are  applicable  to  his  problem. 

The  Fourier  Transform  can  give  erroneous  results  when  improperly 
used.  When  a truncated  Fourier  series  is  used  on  non-periodic  data, 
.purious  oscillations  can  be  induced  when  the  inverse  is  taken.  This 
property,  known  as  the  Gibbs  Phenomenon,  is  described  in  detail  ir.  any 
.;ood  reference  on  Fourier  Transforms.  Analogous  errors  are  also  intro- 
duced because  of  the  finite  data  length  in  the  time  domain.  If  the 
Fourier  Transform  is  blindly  applied,  the  user  may  find  that  a signifi- 
cant compression  ratio  has  been  achieved  at  the  expense  of  losing  all 
the  relevant  information  in  the  data.  Similar  pitfalls  can  occur  in  the 
use  of  the  Walsh  Transform. 

The  transfer  function  can  also  be  abused  as  a data  compression 
device.  The  most  common  pitfall  occurs  when  the  system  from  which  the 
data  is  taken  is  not  adequately  described  by  a set  of  constant  coefficient 
differential  equations.  For  example,  the  system  may  contain  significant 
non-linearities  or  time  varying  coefficients.  In  such  cases,  the  coeffi- 
cients in  the  transfer  function  will  not  give  faithful  reconstruction  of 
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the  original  data  and  will  give  an  erroneous  picture  of  the  process 
under  test. 

The  only  pitfall  which  can  occur  from  use  of  non-diaensionalized 
parameters  is  incorrect  modeling  of  the  system  under  test.  However, 
careful  modeling  should  always  be  done,  regardless  of  the  data  collection 
or  data  analysis  technique  to  be  used. 


Avoidance  of  the  pitfalls  listed  here  is  accomplished  through 
careful  study  of  a technique  and  how  it  applies  to  the  physical  process 
being  tested.  If  possible,  the  system  should  be  modeled  and  a simulation 
developed.  The  data  compression  technique  being  considered  should  then 
be  tested  on  the  simulated  data.  After  the  compression  has  been  achieved, 
the  data  should  be  reconstructed  to  determine  how  much  information  was 
lost  during  the  compression.  The  user  should  then  choose  the  technique 
which  gives  the  best  compromise  between  compression  ratio  and  fidelity 
of  the  reconstructed  data  to  the  original  data. 
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CHAPTER  4 

STATISTICAL  REPRESENTATION 


4.1  INTRODUCTION 

Various  statistical  parameters  are  used  to  describe  large  groups  of 
data.  After  the  parameters  are  computed,  the  basic  data  may  be  stored 
or  discarded.  Other  statistical  techniques  may  be  used  to  discard  some 
individual  pieces  of  data.  The  subjects  in  this  chapter  are  discussed 
briefly.  For  details  the  reader  is  referred  to  the  references. 

4.2  PARAMETER  ESTIMATION 

In  this  paragraph,  a group  of  data  will  be  referred  to  as  a sample. 
In  order  to  summarize  the  information  in  a sample,  certain  representative 
values  must  be  calculated.  These  representative  values  fall  into  two 
groups.  One  group  measures  the  central  tendency  of  the  sample  and  the 
other  measures  the  dispersion  of  the  sample.  Usually  values  from  both 
groups  are  needed  to  summarize  the  sample. 

4.3  MEASURES  OF  CENTRAL  TENDENCY 


The  most  common  measure  of  central  tendency  is  the  arithmetic  mean. 
If  these  are  n values,  Xj,  X2«..Xn  in  a sample  the  arithmetic  mean,  X, 
is  calculated  by  the  formula 


V “ I 
x * " i-i 


*» 


(4.3-1) 


Two  properties  of  the  arithmetic  mean  are  (1)  the  sum  of  the  deviations 
from  the  mean  are  zero  and  (2)  the  sum  of  squares  of  the  deviations  from 
the  mean  is  less  than  the  sum  of  squares  of  the  deviations  from  any 
other  value.  The  arithmetic  mean  has  the  following  advantages:  (1)  it 
is  easily  calculated,  (2)  it  is  easily  understood,  (3)  it  is  commonly 
used,  and  (4)  it  lends  itself  to  algebraic  manipulation.  On  the  other 
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hand,  it  has  the  disadvantage  of  being  quite  sensitive  to  extreme  values 
and  may  be  far  from  representative  of  the  sample. 
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The  midrange  is  a representative  value  which  may  be  used  to  approxi- 
mate the  arithmetic  mean.  The  midrange,  M^,  is  calculated  by  the  formula 

“r  * 2 (XMIN  + XKAX)  (4.3-2) 

It  is  simply  the  arithmetic  mean  of  the  largest  and  smallest  values  in 
the  sample. 

It  has  the  advantage  of  being  easily  and  quickly  calculated.  Since 
it  ignores  the  intermediate  values,  midrange  has  the  disadvantage  of 
being  unrepresentative  if  either  the  maximum  or  minimum  value  is  atypical 
o f the  values  in  the  sample. 

The  median  is  often  used  to  describe  a sample.  The  median  is  that 
alue  for  which  half  the  values  in  the  sample  are  less  than  the  median 
value  and  half  greater.  When  the  sample  values  are  arrayed  in  order  of 
magnitude  from  lowest  to  highest,  the  median,  M,  is  the  (n+l)/2  value. 

If  there  are  an  even  number  of  observations,  the  median  is  the  arithmetic 
mean  of  the  two  middle  values;  i.e.,  for  r.  values,  X^,  where  n is  even, 
the  median  is 


M “ 5 (X, 


n/2 


+ w 


(4.3-3) 


If  there  are  an  odd  number  of  values,  the  median  is  the  middle  value; 
i.e.,  for  n values,  X.,  where  n is  odd,  the  median  is 


M = 


(4.3-4) 


The  median  is  easy  to  calculate  and  is  often  more  typical  of  the  data 
than  the  arithmetic  mean  since  it  is  not  affected  by  extreme  values. 
Some  disadvantages  are  (1)  that  the  values  must  be  sorted  and  arrayed 
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before  the  Median  is  computed,  (2)  it  does  not  lend  itself  to  algebraic 
Manipulation,  and  (3)  if  the  data  fall  into  tvo  distinct  groups  it  could 
be  Misleading.  Theoretically  the  probability  is  one  half  that  an  observation 
selected  at  randoe  will  be  less  than  the  Median.  The  sub  of  the  absolute 
values  of  the  deviations  fro#  the  Median  is  less  than  the  sum  of  absolute 
values  from  any  other  value.  When  there  are  several  saMple  values  which 
are  identical,  the  Median  May  not  have  half  the  saaples  below  and  above 
that  theory  indicates. 

The  data  May  be  described  by  retaining  only  points  which  divide  the 
saaple  into  convenient  groups.  One  such  division  is  the  division  into 
percentiles.  A percentile,  Pp,  is  that  value  for  which  pZ  of  the  values 
are  less  than  Pp  and  (IOO-p)Z  of  the  values  are  greater  than  P^.  When 
the  values  are  arrayed  in  order  of  Magnitude,  then  Pp  is  the  p(n+l)/100th 
value  if  p(n+l)/100  contains  a fraction;  then  the  value  la  a linear 
interpolation  between  the  two  values  on  either  side.  If  the  value 
p(iH-l)/100  falls  outside  the  data,  use  the  first  or  last  value,  whichever 
is  appropriate. 

As  a slnple  example,  consider  the  following  set  of  data:  1,  2,  2, 

3,  4,  5,  5,  5,  9,  11.  The  95th  percentile  is  the  95 <11)/ 100  * 10.45 
value  or  11.  The  80th  percentile  is  the  80(11)/ 100  *8.8  value  or  5 + 

0.8(4)  * 8.2.  The  20th  and  25th  percentiles  are  both  2.  The  Median  is 
the  50th  percentile  and  in  this  example  is  4.5.  The  10  percentile 
numbers  are  referred  to  as  deciles  and  the  25,  50,  /5,  and  100  percentile 
nuabers  are  referred  to  as  quartiles. 

The  mode  is  the  most  frequent  value  that  appears  In  the  saaple.  In 
the  example  In  the  previous  subparagraph,  the  mode  is  5.  There  can  be 
several  nodes  in  a given  saaple.  If  all  values  in  a saaple  are  different, 
then  there  is  no  node.  When  any  value  occurs  more  frequently  than  its 
neighbors,  it  is  referred  to  as  a relative  node.  The  aost  frequent 
value  is  called  the  absolute  node.  There  can  be  several  absolute  '“odes. 
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4.4  MEASURES  OF  b.  'OP®SION 


The  measures  of  central  tendency  do  not  describe  the  spread  of 
values.  Three  coaoon  values  of  dispersion  are  discussed  here. 

The  Range,  R,  is  the  difference  between  the  maxima  and  minimum 
values.  It  is  calculated  by  the  formula 

R * 1Sm  - ’Sun 

The  range  is  easy  to  calculate  but  has  the  disadvantage  that  it  ignores 

intermediate  values. 


The  variance  and  standard  deviation  may  be  considered  together. 

The-  standard  deviation  is  the  positive  square  root  of  the  variance.  For 

2 

this  reason  the  variance  is  referred  to  by  the  symbol  s and  the  standard 
deviation  by  s.  There  are  two  ways  to  compute  s4.  one  uses  n,  the 
number  cf  values  in  the  sample,  in  the  denominator.  The  other  uses  n-i . 
Both  ways  may  be  shown  using  the  same  formula.  The  method  which  uses  n 
provides  a biased  estimator  of  the  popular ion  variance  while  that  with 
n-1  provides  an  unbiased  estimator.  An  estimator  is  unbiased  if  its 

expected  value  is  equal  to  the  population  parameter.  The  expected  value 

2 n- 1 2 2 

ot  s',  when  n is  used  in  the  denominator,  is  — — a , where  a is  the 

n 

population  variance.  The  formulas  given  below  are  equivalent  and  selec- 
tion of  the  one  to  use  should  be  made  by  determining  which  cne  is  the 
easiest  to  calculate.  The  first  one  given  is  usually  the  easiest  for 
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at  ions.  If  the  sample  has  n values, 

n n 2 

_ Z x2  (L  x ) 

i-1  i ifU_/n  v = n for  biased  estimator 


then 


(4.4-2) 
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{Z  X . )/n 
i-1  1 ,u 


(4.4-3) 

n-1  for  unbiased  estimator 


n _ 2 

(3)  s mL  " X)  __  (4.4-4) 

i-1 . X = arithmetic  mean  of  the  sample. 


The  coefficient  of  variation,  C^,  la  a measure  of  relative  variation. 
It  is  computed  fro*  the  formula  C • s/X  (4.  -*-5) , where  a la  the  sample 
standard  deviation  and  X is  the  arithmetic  *ean  of  the  sample.  It  has 
been  observed  that  saaples  with  numerically  large  values  tend  to  vary 
widely  and  those  with  numerically  small  values  tend  to  vary  nairowly. 

In  order  to  make  a comparison  of  the  variation  among  two  groups  of  data 
with  different  magnitudes,  the  coefficient  of  variation  may  be  used.  It 
can  be  usea  to  compare  the  variation  in  two  samples  which  are  measured 
in  two  different  units;  e.g. , s comparison  of  variation  in  height  with 
variation  in  weight. 


4.5  COEFFICIENTS  IK  A HATH  MODEL 

A math  modal  is  simply  an  equation  which  relates  an  observed  value, 
Y,  to  one  or  more  known  values,  X£.  In  practical  cases  most  math  models 
are  linear.  The  reason  is  that  linear  equations  are  easy  to  manipulate 
and  calculate.  A math  modal  is  then  of  the  fora 

n 

Y ■»  E bl  where  there  are  n+i  known  values (4. 5-1) 
i-0 

The  linear  fora  can  be  used  to  deal  with  very  general  situations.  In 
the  case  of  a trajectory,  position  is  represented  by  a second  degree 
equation  in  time,  viz, 

y * ho  + bjt  + b2t2  <4.5-2) 

2 

If  we  let  %Q  * l,  Xj  * t,  ard  ■ t , the  second  degree  equation 
in  one  variable  can  be  transformed  into  a first  degree  equation  in  three 
variables. 

An  equation  of  the  form  y - ax^  can  be  transformed  into  s linear 
equation  by  taking  logarithms.  Specifically, 

log  y - leg  a + b log  X (4,5-3) 
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Here,  Y - log  y,  be  * log  a,  bj  * b,  XQ  ■ I,  Xj  m log  4. 

A common  Method  of  determining  the  coefficients  is  to  apply  the 
method  of  least  squares.  To  use  the  Method,  several  observations  of  Y 
gust  be  taken  for  various  known  values  of  the  X^’s.  Let 
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Y is  : he  satrix  of  n observed  values. 

X is  t W na trix  of  the  n known  paints  of  the  k *■  i»  X^'s. 

B is  the  B.-£!ci*  of  the  c + l coefficients  of  the  s. 
c,  ts  the  ontri*  of  observational  errors. 

,'ov  Y = XB  + C (4.5-4) 

is  the  matrix  equation  of  observations.  The  gethod  of  least  squares 
assumes  the  errors  to  be  independent,  with  Mean  zero,  and  a cos»on 
variance, c~  (i.e.,  they  are  hoscscedastic) . The  gethod  of  least  squares 
finds  the  values  of  the  coefficients  which  minimize  the  sum  of  the 
squares  of  the  residuals.  The  syabol  " , above  a variable,  will  indicate 
that  it  is  an  estimate  of  that  parameter.  The  sum  to  be  minimized  is 


S - r®  - (Y  - XB)  '(Y  - XB) 


(4.5-5) 
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A symbol,  indicates  the  transpose  of  the  matrix. 

When  the  partial  derivatives  are  taken,  set  equal  to  zero  and 
manipulated  a bit,  the  following  result  Is  obtained: 

X'XB  - X'Y  (4.5-6) 

This  is  the  matrix  fore  of  what  are  termed  normal  equations. 

B - (X'X)”l(X'Y)  (4.5-7) 

2 2 

The  estimate  s of  ^ is  given  by 


1 

s 


1 

n-(k+I) 


(Y'Y  - B'X'Y) 


(4.5-8) 


-1 

The  matrix  (X  X)  is  tbe  variance-covariance  matrix  of  the  variances  of 
the  b’s. 


Let  be  the  estimate  of  the  variance  of  b^  and 


covariance  of  b^  and  . 


Then  the  diagonal  elements  of 


^bibj 


be  the 


2 —1 
s (X  X) 


are  the 


values  of  the 


, then 


- s2  dlftg  (*'*)-' 


(4.5-9) 


the  values  of  che  variances  say  be  used  Co  determine  confidence  intervals 
for  che  estimates  of  cbe  coefficients. 

4.6  CORRELATION  AND  REGRESSION  ANALYSIS 

Correlation  aeans  the  degree  of  association  aaong  variables.  The 
quantities  used  to  measure  the  correlation  are  termed  correlation  coeffi- 
cients.  Regression  is  a tern  for  the  methods  used  to  determine  the  best 
functional  relationship  among  variables.  In  statistics,  when  a dependent 
variable  is  expressed  as  a function  of  one  or  more  independent  variables, 
the  function  is  termed  a regression  function.  In  other  areas  it  is 
sometimes  termed  a response  function.  The  statistical  analysis  of  a 
regression  function  and  the  determination  of  the  coefficients  may  not 
mean  that  a casual  relationship  must  be  made  by  a person  well  trained  in 
the  subject  scatter  field  in  which  the  test  was  made. 

A regression  function  is  a math  model.  The  discussion  of  least 
squares  which  appears  in  paragraph  4.5  also  applies  here.  Polynomials 
of  degree  a say  be  considered  as  linear  functions  with  the  m+1  variables 
X , ...  X where  X =1,  and  X,  is  the  ith  power  of  the  variable.  Non- 

G Wt  o i 

linear  functions  can  often  be  linearized  by  a proper  transformation. 

After  the  coefficients  are  computed  they  must  be  converted  to  the  original 

terns.  The  example  y = a3  shown  in  paragraph  4.5  (equation  4.5-3)  would 

have  a ■ 10  , and  b » b,  where  b and  b,  are  obtained  from  the  linear 

l o 1 

expression. 

It  Is  possible  to  determine  the  goodness  of  fit  by  examining  the 
variance  and  sums  of  squares  of  the  variables.  Such  an  examination  is 
called  an  Analysis  of  Variance.  In  the  ease  of  polynomials  it  is  possible 
to  decide  whether  the  last  term  added  has  any  significance.  In  general, 
it  is  possible  to  determine  if  several  coefficients  are  significantly 
different  from  zero.  The  case  of  deciding  whether  a number  of  coeffi- 
cients are  different  from  zero  is  discussed  here.  An  Illustrative 
example  is  shown  in  the  next  paragraph.  All  the  symbols  and  their 
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definitions  are  the  state  as  those  used  in  paragraph  4.5  concerning  the 
coefficients  of  math  models.  Assume  that  it  is  desired  to  know  If  the 
last  p<fe*T  of  the  coefficients  are  significantly  different  from  zero. 

Tc  do  this  create  two  new  matrices,  X^  and  where  X^  is  the  matrix 
formed  by  removing  from  7.  the  p colons  that  correspond  to  the  suspect 
coefficients  and  By  is  the  matrix  formed  by  removing  from  B the  appropri- 
ate p coefficients.  Solve  the  reduced  set  of  equations.  This  solution 
is: 


£ * (X'X  rVY)  (4.6-i) 

V V V V 

...  -2  1 (Y'Y  - B'X'Y)  , « 

with  variance  Sy  - n-{k+V-p)  * v v ' (h.6-2) 

The  following  table  should  then  be  computed.  This  Is  called  an 
analysis  of  variance  table.  The  mean  square  column  is  the  sum  of  squares 
divided  by  the  degrees  of  freedom.  The  table  is  adapted  from  reference 
[4-9]  as  is  the  explanation  following. 

TABLE  4-1  ANALYSIS  OF  VARIANCE  TABLE 

Source  of  Variation Degrees  of  Freedom  Sum  of  Squares  Hean  Squares 

Total 

Due  to  fe+1  constants 
Residual  (from  large  solution) 

Due  to  fc+l-p  constants 
Residual  (from  reduced  solution) 

Due  to  additional  p constants 


n 

fe+1 

n-{fe+l) 

fe+l-p 

n-(fe+l-p) 

P 


Y'Y 

B'X'Y 


Y'Y-B'X'Y 

B'X'Y 

v v 

Y'Y-B'X'Y 

v V 

B'X'Y-B'X'Y 


Y'Y 
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K 

f = ~2  is  distributed  as  F with  degrees  of  freedom 

Q 

fe+1,  n-(fefl)  and  serves  as  a test  of  whether 
all  fe+1  constants  account  for  a significant 
reduction  in  the  error  variance. 

p 

F = is  distributed  as  F with  degrees  of  freedom 

p.  n-(fe+l)  and  serves  as  a test  of  whether  the 
addition  of  the  p coefficients  accounts  for  a 
significant  reduction  in  the  error  variance 
over  that  accounted  for  by  the  first  fc+l-p 
constants . 

The  following  ijlustrative  example  was  adapted  from  reference 
‘.i-'j].  The  notation  has  been  changed  to  conform  to  that  used  in  this 
napter.  The  numbers  and  computations  are  taken  directly  from  the 
: c ference.  The  data  are  represented  in  tabular  form  below. 

TABLE  4-2 


This  corresponds  to  the  situation 


Xo  + b!  x!  + b2  x2 


('•,6-3) 


definitions  are  the  sane  as  those  used  in  paragraph  4.5  concerning  the 
coefficients  of  nath  models.  Assume  that  it  is  desired  to  know  if  the 
last  p<fefl  of  the  coefficients  are  significantly  different  from  zero. 

To  do  this  create  two  new  matrices,  X^  and  B^,  where  X^  is  the  matrix 
formed  by  removing  f-ia  X the  p columns  that  correspond  to  the  suspect 
coefficients  and  B 3 the  matrix  formed  by  removing  from  B the  appropri- 
ate p coefficients.  Solve  the  reduced  set  of  equations.  This  solution 
is: 


with  variance 


v 


(X'X  )-1 (X"Y ) 

V V V 


n-(fe+1-p) 


(Y'Y  - B'X'Y) 
1 v v 


(4.6-1) 

(4.6-2) 


The  following  table  should  then  be  computed.  This  is  called  an 
analysis  of  variance  table.  The  mean  square  column  is  the  sum  of  squares 
divided  by  the  degrees  of  freedom.  The  table  is  adapted  from  reference 
[4-9]  as  is  the  explanation  following. 


TABLE  4-1  ANALYSIS  OF  VARIANCE  TABLE 


o c Variation  Degrees  of  Freedom 

Totc.1  n 

Due  to  feH  constants  fe+1 

Residual  (from  large  solution)  n-{fe+l) 

Due  to  fc+l-p  constants  fe+l-p 

Residual  (from  reduced  solution)  n-(fe+l-p) 

Due  to  additional  p constants  p 


Sum  of  Squares 
Y'Y 


Y'Y-B'X'Y 

B'X'Y 

V V 

Y'Y-B'X'Y 

V V 

B'X'Y-B'X'Y 

V V 


Mean  Squares 

1 Y'Y 
n 

K 

s2 

A 

s2 

V 

P 
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s 


= n^ri+!rv'Y  * B'X'Y)  = i (5*808  473  °38) 

= 1.936  157  679 
s =1.391  4588 
sb£)  = 0.323  627 
sbl  = 0.139  054 
sb2  = 0.205  283 


io  test  the  significance  of  b2>  the  last  coition  is  dtopped  from  X. 


Then  X'X  = i 

V V 


50  67 

L 67  194 j 

f 194 


'XvX  J1  = 5211  | 


-67 


X'Y  = 


54 

97 


1 


6 = (X.'X  )_1  (X'Y)  = 


-67" 

50 


0.763 

0.236 


193 

245" 

422 

951 

P = 1 (B'X'Y  - B X'Y)  = i 
D v v 1 

P = 0.046066 


r-  _ P _ 0.046066  _ „ 

F ■ 72  ■ T7536T56  - °-024 


{64.191  527  - 64.145  461) 


The  numerator  has  1 and  the  denominator  3 degrees  of  freedom.  At  the 
95Z  confidence  level, 

F0.95°’3)  “ 10*13 

Since  F<10. 13,  b2  is  not  regarded  as  being  significantly  different, 
statistically,  from  zero.  Therefore,  it  nay  be  disregarded. 


The  preceeding  discussion  about  least  squares  has  been  limited  to 
the  case  where  the  variances  of  the  observations  were  Independent  and 
equal.  For  a discussion  of  the  cases  where  the  observation  errors  are 
not  equal  and/or  not  independent  the  reader  is  referred  to  references 
{4-2]  or  [4-9], 


Correlation  only  tells  how  well  variables  are  related.  The  correla- 
tion coefficient,  r,  between  two  sets  of  data,  each  having  n values,  is 
computed  by  the  following: 


I <XrX)(YrY) 

s 1*1  ' 1 

^ _ p n o H 

[E  (xrxr  E(Yrr)z] 

i*l  1 i*l  1 


(4.6-4) 


where  the  X^s  and  y^'s  are  the  values  in  the  two  sets  of  data,  X is  the 
arithmetic  mean  of  the  X^'s  and  Y is  the  arithmetic  mean  of  the  Y^'s. 

The  range  of  r is  -l£r<l.  If  the  data  are  perfectly  correlated 
|rj*l.  If  the  data  are  uncorrelated  r*0.  Perfectly  correlated  means 
there  is  an  exact  linear  relationship.  If  r>(^  the  slope  of  the,  fitted 
line  will  be  positive.  If  r<0,  the  slope  of  the  fitted  line  will  be 
negative. 

Of  more  importance  in  data  compression  is  serial  correlation.  For 
a set  of  data  which  is  not  random  there  will  be  dependencies  between 
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successive  terms.  Serial  correlation  is  used  to  measure  these  dependen- 
cies. The  coefficient  of  serial  correlation  of  log  fe.  is  the  correlation 
coefficient  between  pairs  of  terms  k units  apart.  Suppose  a set  of  data 
contains  n points  X^,  ^...X^.  The  serial  correlation  coefficient 
of  log  k is  given  by 


H-K.  1 ll-K  n ll-K. 

il]  (Xi"n7fe  1-E1Xi ^Xi+fe“rT7fe 

H7!  \ n7!  7 n-fe  ] ivE  ZV 

n=e  1i1X1)  ■ rhT  .f/i+fe' 


(4.6-5) 


Reference  [4-6]  shows  how  to  use  the  correlogram  of  serial  correla- 
■;cn  coefficients  to  define  envelopes  of  data.  The  reference  shows  that 
.erial  correlation  preserves  periodicity.  The  reference  states  that 
'the  correlogram  'peaks  out’  on  the  positive  side  of  zero  whenever  the 
:nput  data  completes  a recognizable  period  of  information. " Tests  may 
He  applied  to  see  if  the  various  envelopes  are  statistically  different, 
they  are  not,  the  user  has  the  option  of  discarding  some.  For  details, 
he  reader  is  referred  to  reference  [4-6],  For  other  uses  of  the  serial 
^relation  see  Chapter  2. 


STATISTICAL  SAMPLING 

At  times  it  is  desirable  to  retain  only  a portion  of  the  data 
available.  The  retained  portion  is  called  a sample.  From  the  sample, 
inferences  can  be  made  about  the  collective  properties  of  all  the  data, 
tc  is  Important  to  choose  a sample  that  is  large  enough  for  valid 
inferences  to  be  made  and  yet  be  small  enough  to  meet  considerations  of 
time,  computer  storage  limitations,  ease  of  computation,  cost,  etc. 
Reference  [4-9],  pp  1-3,  states,  "Statistical  inferences  are  basically 
pr, dictions  of  what  would  be  found  to  be  the  case  if  the  parent  popula- 
tions could  be  and  were  fully  analyzed  with  respect  to  the  relevant 
characteristic  or  characteristics." 
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In  order  to  draw  correct  inferences,  the  method  by  which  a sample 
was  chosen  oust  be  known.  There  are  two  general  types  of  sampling: 
judgemental  and  chance.  Samples  selected  by  a chance  method  are  called 
probability  samples.  If  all  the  elements  of  a population  have  an  equal 
chance  of  being  selected,  the  sample  is  called  a random  sample.  This  is 
a necessary  condition  but  is  not  sufficient  for  a sample  to  be  a random 
sample.  A sufficient  condition  for  a sample  to  be  random  is  that  each 
possible  sample  must  have  an  equal  chance  of  being  selected.  Reference 
{4-9]  notes,  "experience  teaches  that  l.t  is  not  safe  to  assume  that  a 
sample  selected  haphazardly,  without  any  conscious  plan,  can  be  regarded 
as  if  it  had  been  obtained  by  simple  fandom  sampling.  Nor  does  it  seem 
possible  to  consciously  draw  a sample  at  random."  The  statistical 
techniques  in  this  chapter  are  applicable  to  random  samples  and  may  or 
may  not  be  applicable  to  other  types  of  sampling. 

One  example  of  random  sampling  occurs  when  there  is  a block  of  data 
consisting  of  N points.  A random  sample  may  be  obtained  by  assigning  a 
number  to  each  of  the  N values;  then  by  using  a random  number  generator,  or 
random  number  table,  to  list  a number,  equal  to  the  sample  size,  of 
different  random  numbers  less  than  N.  Select  from  the  list  of  points 
only  those  whose  position  on  the  list  corresponds  to  the  random  numbers. 

Another  example  is  the  case  when  the  data  may  be  known  to  have 
occurred  at  different  times.  Suppose  it  is  desired  to  estimate  the 
turnaround  time  for  jobs  sent  to  a computer.  Jobs  sent  to  the  computer 
are  given  a number  which  corresponds  to  the  day,  hour,  and  minute  at 
which  they  are  received.  The  same  information  is  recorded  when  the  job 
is  finished.  A random  sample  may  be  chosen  by  considering  two  digit 
random  numbers  in  blocks  of  three.  The  first  group  will  correspond  to 
the  day  of  the  month,  the  next  to  the  hour  of  the  day,  and  the  final  to 
the  minute.  The  job  selected  would  be  the  job  received  closest  to  the 
random  number  and  not  previously  selected. 

Reference  [4-9]  gives  two  methods  of  determining  the  size  of  the 
sample  to  be  drawn  to  estimate  the  mean  of  a population.  It  also  lists. 
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one  method  of  determining  the  size  of  sample  needed  to  estimate  the 
standard  deviation  of  a population  to  within  a certain  percent  of  its 
true  value.  One  method  is  outlined  below.  For  more  details  the  reader 
is  referred  to  the  reference. 

Assume  it  is  desired  to  know  the  mean,  m,  of  a population  and  that 
one  is  willing  to  take  a risk,  a,  that  the  estimate  is  off  by  d or  more. 
What  size  sample  is  needed?  There  is  available  an  estimate,  s,  of  the 
population  standard  deviation  oased  on  v degrees  of  freedom. 

From  tables  of  the  Student-t  distribution,  locate  t**tj_a  v for  v 
degrees  of  freedom,  lie  sample  size  is  then  computed  f .on  the  formula 

2 2 

n - (A. 7-1) 

r 

The  %ralje  to  use  should  be  the  smallest  integer  la-ger  than  or  equal  to 
n.  If  the  mean,  X,  of  a sample  of  size,  n,  is  computed,  then  with  loco- 
ed confidence,  it  can  be  said  the  interval  from  X-d  to  X+d  includes 
the  population  mean,  m. 

i.8  ANALYSIS  OF  VARIANCE 

Analysis  of  Variance  is  a technique  used  to  separate  variation  in 
data  into  source  components.  The  sources  of  variation  considered  in  the 
Analysis  of  Variance  are  called  variables  or  factors.  The  analysis  of 
the  variation  depends  on  the  particular  grouping  of  the  data  or  test 
design.  An  example  of  an  analysis  of  variance  procedure  was  shown  in 
paragraph  A. 6 of  this  chapter.  That  paragraph  discussed  the  procedure 
to  use  to  determine  whether  certain  coefficients  of  a regression  line 
were  significant.  Because  of  the  large  number  of  different  applications, 
the  reader  is  referred  to  the  references  for  the  particular  technique  to 
use  in  his  application.  References  (4-9]  and  [4-10]  give  exasq>les  and 
work  sheets  to  describe  the  various  processes.  Many  of  the  books  listed 
as  references  also  describe  work  sheets  and  give  examples. 
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4.9  SUMMARY 


This  chapter  provided  some  statistical  techniques  which  will  allow 
a user  to  eliminate  amounts  of  data.  Everything  described  has  been 
available  for  some  time.  The  techniques  may  be  termed  merely  classical 
statistics.  Paragraph  4.2,  which  describes  parametric  estimation, 
mentions  individual  values  which  may  be  used  to  replace  large  groups  of 
data.  Paragraph  4.6,  Correlation  and  Regression  Analysis,  gives  techniques 
which  enable  the  user  to  replace  a large  group  of  data  with  coefficients 
of  a function  or  to  eliminate  one  of  two  groups  of  data  and  replace  it 
with  a linear  function  which  relates  the  remaining  group  to  the  one 
eliminated.  Paragraph  4.7,  Statistical  Sampling,  is  presented  because  a 
smaller  random  sample  may  be  taken  from  a larger  group  and  allow  infer- 
ences to  be  drawn  about  the  collective  properties  of  the  larger  group. 
Equation  (4.7-1)  shows  how  to  compute  the  size  sample  to  select  if  one 
desires  to  know  the  mean  to  within  a given  amount  of  uncertainty.  Para- 
graph 4.8,  Analysis  of  Variance,  merely  gives  a definition  and  refers 
the  reader  to  source  documents. 
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CHAPTER  5 

OPTIMAL  ESTIMATION  TECHNIQUES 


5. 1 INTRODUCTION 

It  is  Che  intent  of  this  chapter  to  consider  data  compression  in 
relation  to  applied  optiaal  estimation.  In  particular,  this  chapter 
will  look  at  the  implications  of  the  use  of  such  techniques  in  conjunc- 
tion with  discrete  Kalman  Filters.  Starting  with  a statement  of  the 
discrete  filtering  problem,  the  compression  problem  will  be  set  up  and 
the  objectives  of  its  utilization  discussed.  For  the  most  part,  this 
chapter  represents  a survey  of  the  use  of  data  compression  techniques  in 
the  area  of  applied  recursive  optimal  estimation.  It  is  not  intended  to 
be  a theoretical  treatise  but  rather  a more  practical  approach  oriented 
to  problem  solving.  Both  optimal  and  subcptiaal  compression  techniques 
will  be  introduced  along  with  a discussion  of  techniques  for  evaluating 
the  suboptimal  types. 

"Optiaal"  data  compression  means  that  the  data  compression  and 
corresponding  estimation  are  performed  in  such  a way  as  to  minimize  some 
selected  measure  of  error  and  to  utilize  ail  information  concerning  the 
system  dynamics,  noise  statistics  and  initial  conditions.  The  optimal 
algorithms  presented  here  calculate  unbiased,  minimum  variance  estimates 
and  say,  under  certain  conditions  such  as  Gaussian  error  probability 
density  functions,  be  optimal  in  several  other  senses  such  as  least- 
squares,  maximum  likelihood,  Bayesian  et  al. 

An  attempt  has  been  made  to  include  guidelines  on  such  matters  as 
compression  design  and  recommended  filtering  and  sampling  rates.  General- 
ized matrix  forms  and  algorithms  will  be  presented  to  the  extent  possible 
and  a simple  but  illustrative  scalar  example  will  be  carried  throughout 
the  section. 
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Preceding  page  blank 


First  consider  the  basic  linear  discrete  aodel  for  which  a multi- 
stage  recursive  data  compression  and  estimation  algorithm  is  to  be 
constructed.  The  system  is  governed  by  the  following  equations: 


x(k+l)  * 4(k+l,k)x(k)  + v(k) 


(5.1-1) 


E[w(k)]  = 0 


(5.1-2) 


E[w(j)w*(k)l  “ Q(k)  6 


(5.1-3) 


where  £,  the  state  vector,  is  propagated  linearly  by  a transitition 
matrix  v,  and  the  state  is  corrupted  by  a zero-mean  white  process  noise 
w,  with  covariance  Q.  The  observation  equations  are: 


£(k)  * H(k)x(k)  + v(k) 


(5.1-4) 


Efv(k)]  - 0 


(5.1-5) 


tfv(j)vT(k)]  , R(k)5 

r« 


(5.1-6) 


Tne  observations  £ are  linearly  related  to  the  state  vector  by  the 
observation  matrix  H and  are  corrupted  by  zero-mean  white  noise  with 
covariance  R.  In  addition,  the  plant  and  observation  errors  are  uncorre 
la ted;  i.e.. 


E[v(j)wT(k)l  - 0 


(5.1-7) 


The  various  assumptions,  such  as  linearity  and  independent  errors,  can 
be  (and  have  been)  removed  by  investigators  over  the  years  but  will  be 
retained  for  purposes  of  simplicity  and  clarity  in  this  treatment. 
Serial  correlation  of  observation  error  will  be  considered  later. 


The  optimal  recursive  estimation  algorithm  for  this  problem  is  well 
known  as  the  Kalman  Filter  and  was  first  published  by  Kalman  [5-1,2]  . 
The  estimation  error  at  time  t(j),  given  observations  through  time  t(k). 


6i 


€(j|10  * x(j|k)  - x(j) 


(5.1-8) 


where  x is  the  estimate  of  tlie  true  state  x.  The  state  error  covariance 
matrix  is  then  defined  as: 


i-(j|k)  - E[€(jjk)€T(jJk)3 


(5.1-9) 


The  Kalman  Filter  is  then  the  linear,  recursive  minima  variance  estimator 
for  the  above  problea.  It  is,  in  fact,  a set  of  rules  for  optimally 
combining  the  observations  with  a priori  estimates  of  the  state-given 
statistics  of  the  relevant  processes-  The  resulting  algorithm  - not 
derived  here  - is  usually  presented  as  a two-stage  calculation. 

Extrapolation  Stage 


State  x(kjk-l)  - ?<k,k-l)x(k-l jk-1) 


(5.1-10) 


Covariance  P(k|k-1)  - *(k,k-l)P(k-l |k-l) (k,  k-l)  + 

Q(k-l)  (5.1-11) 


Update  Stage 


G(k)  * P(k |k— 1)HT (k)[H(k)P(k|k-l)HT (k)  + 


(5.1-12) 


State 


x(kjk)  - x(k jk-1)  + G(k)[*(k)  - K(k)* 

(k jk-i)]  (5  i— 1 3) 


Covariance  P(kjk)  « [I  - G(k)H(k) jP(klk-l)  (5.1-14) 


The  implications  and  application  of  this  algorithm  are  beyond  the  scope 
of  this  treatment,  but  the  author  highly  recommends  Gelb  [5-3}  as  an 
excellent  reference  on  the  practical  aspects  of  Kalman  Filter  design. 


Example 


A scalar  exa'  ,.le  of  such  a filtering  problem  is  the  estimation 
of  a first  order  Markov  Process  with  exponential  correlation,  i.e., 


The  state  model  is  simply 


with  observations 


where 


E[x(t)x(t+T)  ] = az  exP  ( ”0iT) 

X 

(5.1-15) 

simply 

x(k+l)  = vx(k)  + w(k) 

(5.1-16) 

2(k)  - x^k)  v v(k) 

(5.1-17) 

*c ' 5+i  - ck 

(5.1-18) 

Y * exp  C-oaO 

(5.1-19) 

w — N(o,q) 

(5.1-20) 

v ~ N(o, r) 

(5.1-21) 

q * rf  (l-V^) 

(5.1-22) 

r --  _2 

t 

(5.1-23) 

The  corresponding  Kalman  Filter  for  this  problem  is  then: 
x(k |k-l)  * Yx(k-1  |v-l; 


k-l)  = vx(k-J l^-l)  (5.1-24) 

p(k|k-l)  * v2p(k-l|k-l)  -f  q (5.1-25) 

8(k)  = pfkjk-ll  /[p(k(k-l)  a3(k>]  (5.1-26) 


T 


i 


i 

* 


*(k|k)  - x(kjk-l)  + g(k)[z(k)  - x(k|k-l)]  (5. 1-27) 

P(k|k)  - [1  . g(k)]p(k|k-l)  - g(k)a2(k)  (5.1-28) 

Figure  5-1,  a computer  generated  Gaussian  white  noise  sequence,  was 

utilized  to  drive  equation  (5.1-16)  and  thus  simulate  a typical  Harkov 
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Process  of  this  type  using  values  of  Y “0*9  and  * 1*0.  The  same 
Gaussian  random  number  generator  was  utilized  to  generate  white  observa- 
tion errors  with  a * 0*5  resulting  in  the  simulated  observations  of  x - 
the  z's.  In  Figure  5-2  these  observations  were  Introduced  to  the  Kalman 
Filter.  The  resulting  estimation  errors,  e (after  update),  are  plotted 
along  with  the  associated  error  standard  derivation  o £,  calculated  by 
the  Kalman  Filter.  Notice  the  saw-tooth  pattern  of  a caused  by  the 
time  extrapolation  which  increases  0£  followed  by  the  update  which 
decreases  0£  because  of  the  addition  of  measurement  information. 

Reformulate  the  basic  recursive  estimation  problem  into  a multi- 
stage data  compression  and  estimation  problem.  Suppose,  as  shown  in 
Figure  5-3,  that  the  filter  is  cycled  once  every  NAfc  seconds  but  that  it 
is  desirable  to  process  data  at  a rate  N times  the  filter  cycling  rate. 
The  integer  N is  often  referred  to  as  the  compression  ratio.  Therefore 
at  time  t(k)  there  are  N measurements,  equally  spaced  ^t  apart,  that 
have  been  made  since  the  last  filter  cycle  at  time  t(k-N)  which  are  to 
be  processed  at  time  t(k).  This  problem  might  be  expected  when  the 
observation  data  are  available  at  a rate  higher  than  that  rate  which  can 
computationally  cycle  the  full  filter  or  that  rate  which  is  necessary  to 
recover  the  desired  signal  frequency.  If  the  additional  data  is  ignored, 
as  is  the  case  when  using  the  conventional  Kalman  Filter  since  it  accepts 
only  a single  observation,  much  useful  information  concerning  the 
signal  that  would  improve  the  accuracy  of  our  estimation  procedure  is 
discarded. 

The  objective  of  optimal  data  compression  techniques  is  to  combine 
the  N measurements  in  some  manner  into  a single  parameter  (or  set  of 
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FIGURE  5-2 

EXAMPLE*.  KALMAN  FILTER  ERROR 
STANDARD  DEVIATION  AND  ERRORS 


i 
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parameters)  in  such  a way  as  to  minimise  loss  of  accuracy  while  maintain- 
ing computational  efficiency.  The  procedure  which  operates  directly  on 
the  measurement  is  referred  to  as  the  "data  compressor"  or  "prefllter" 
and  that  which  operates  more  slowly  on  the  compressed  observation  is 
referred  to  as  simply  the  "filter"  or  "estimator."  A Kalman  Filter, 
such  as  described  previously,  operating  directly  on  the  measurements  at 
the  high-data  rate  and  which  contains  all  the  correct  model  information 
and  statistics  will  be  "optimal."  This  filter  represents  tua  best 
available  and  thus  is  chosen  as  the  standard  for  purposes  r'  performance 
comparisons.  The  primary  goal  is  to  design  a "suboptimal"  data  compress- 
ion technique  that  degrades  only  slightly  (or  within  acceptable  limits) 
from  the  optimal.  Besides  the  obvious  ad  ‘antage  of  computational  effi- 
ciency, data  compression  can  also  be  quite  useful  when  dealing  with 
multiple  data  rates  and  unevenly  spaced  data  if  an  acceptable  common 
estimation  cycle  time  to  which  the  data  might  be  reflected  (and  compressed) 
can  be  determined. 

Undoubtedly  the  best  overall  treatment  of  data  compression  and 
optimal  estimation  is  that  of  Joglekar  (5-4).  This  work  is  comprehensive, 
covering  optimal  batch- weighting  as  well  as  various  averaging  algorithms, 
covariance  evaluation  techniques  and  practical  guidelines  for  design  of 
multi-stage  compression/estimation  schemes.  This  work  wss  conducted  at 
the  Stanford  University  Guidance  and  Control  Laboratory  and  was  sponsored 
by  the  Air  Force  Avionics  Laboratory.  Womhle  [5-5,  6]  at  Georgia  Insti- 
tute of  Technology  derived  an  optimal  recursive  prefiltering  version  of 
the  Kalman  Filter  by  determining  a single  discrete  measurement  that  is 
equivalent  to  a set  of  discrete  measurements. 

Applications  of  various  data  compression  techniques  to  estimation 
type  problems  are,  of  course,  quite  numerous  sod  we  will  list  only  a 
select  few  here.  Bar-Shalos  [5-7]  dep's  with  the  compression  of  data  in 
real-time  nonlinear  estimation  problems  such  as  tb».  linearized  tracking 
filter  for  a re-entry  vehicle.  Clark  [5-8]  applied  data  compression 
techniques  in  the  design  of  a real-time,  dual-bandwidth,  adaptive  Kalman 
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cracking  filter  for  high-speed  Maneuvering  missiles  and  aircraft  in  a 
weapons  control  environment.  Warren  [5-9]  derived  a filter  which  provides 
optinal  compensation  for  time  lag  and  plant  observation  noise  correlation. 
He  applied  the  algorithm  to  position  and  velocity  estimation  for  aircraft 
navigation.  Kizner  [5-10]  utilized  Chebyshev  polynomial  fits  to  derive 
an  optimal  data  compression  which  he  claims  has  better  accuracy  than  the 
minimum  variance  estimate  without  data  compression. 

5.2  OPTIMAL  DATA  COMPRESSION  TECHNIQUES 

In  a sense, the  title  of  this  paragraph  might  appear  self-contradic- 
tory because,  in  application,  data  compression  is  never  implemented 
optimally.  If  it  is  desirable  to  optimally  process  all  the  data, 
merely  use  the  Kalman  Filter.  Optimal  data  compression  is  simply  a 
restructuring  of  the  Kalman  Filter  into  the  multi-stage  problem  of 
Figure  5-3.  The  restructuring  is  constrained  such  that  the  error  covar- 
iance at  the  end  of  each  multistage  is  equal  to  that  of  the  optimal. 

The  reason  for  doing  this  is  to  see  the  optimal  data  compressor  and  thus 
determine  exactly  what  terms  are  neglected  and  test  the  validity  of 
these  simplifying  assumptions. 

Optimal  data  compression  is  a very  important  tool  for  designing 
such  a system.  Both  Notable' s optimal  recursive  prefilter  and  Joglekar's 
batch  optimal  compression  algorithm  will  be  presented,  since,  for  any 
particular  application  and  computer,  one  fora  may  be  preferable  over  the 
other.  Both  algorithms  are  optimal  in  the  minimum  variance  sense  and 
are  exactly  equivalent  in  covariance  at  the  end  of  the  compression 
intervals  to  the  fast  cycling  conventional  Kalman  Filter. 

The  recursive  prefiltering  algorithm  of  Womble  [5-5,  6]  is  presented 
in  Table  5-1.  It  consists  of  a set  of  recursive  matrix  equations  for 
the  prefilter  which  must  be  cycled  N times  before  the  state  and  error 
covariance  are  updated  by  the  estimator  at  the  end  of  the  interval.  The 
prefilter  can  be  cycled  either  as  the  measurements  occur  or  delayed 
until  the  end  of  the  interval  and  processed  as  a batch. 
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TABLE  5-1  OPTIMAL  RECURSIVE  DATA  COMPRESSION  ALGORITHM 


Comnregsion  For  i = 1,  H * k-H,k 


*(0  = HT(i)R-*(i)z(i)  (0 

J(i)  = HT(i)r‘(i)H(t)  (*) 

A 1 (i)  - *(i  ,i-1  )A(£— 1 )*T(i  ,»-l ) + Q (i)  (3*0 

B(i)  = I + J(f)A'(t)  (3b) 

C(i)  = I + A'(i)j(0  (3c) 

A(«)  = A'  (t)B_1(t)  (3d) 

6'(t)  = ♦(i,i-l)B(i-l)  0*) 

«(t)  = [I  - A(t)j(t)]^(i)  ♦ A(i)s(t)  Ub) 

?(0  * C“‘C*)*(£,i-l)S  i-1)  (5) 

J (i ) = J(i-1 ) ♦ *T(i  )J(i  }C(i  )9(i ) (6) 

1(i)  = f(i.  1)  ♦ «*(*  )[■(!)  - j{i)*(OJ  (7) 

A(o)  = J(o)  = o 

l(o)  = I — Initialisation 

i(o)  s $( o)  = o 


Eati'aatiea 

P*(k-N)  = [I  ♦ p{k-lf| k-M) J(W)j-x  P(k-K| k-N)  (8») 

^ (k-lf)  = [I  - P'(k-H)j(20]$(k-Nik-«i)  ♦ P(k-H)f(H)  (9*0 

i(k|k)  * J(N)*'  (Jc-N)  + S{n)  (9b) 

P(kj  k)  = l(N)P  (k-Nj1T(N)  + A(R)  (8b) 
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Joglekar  s [5-4]  algorithm  for  optimal  weighting  of  batch  measure- 
mencs  is  shown  in  Table  5-2.  This  algorithm  is  "cleaner"  than  the 
recursive  algorithm  in  that  the  aatrix  algebra  equations  are  not  partic- 
ularly more  complicated  than  the  original  Kalman  Filter.  In  fact,  it  is 
rather  easy  to  see  that  the  Kalman  Filter  for  the  trivial  case  of  N-l  is 
recovered.  This  appearance  of  simplicity  is  misleading  if  the  dimensions 
of  the  matrices  used  in  the  calculations  are  examined  closely.  The  R* 
matrix,  in  particular,  can  get  quite  large  - (MS  x MN)  where  M is  the 
dimension  of  the  single  observation.  Unfortunately,  it  is  necessary  to 
invert  this  matrix. 

In  Tables  5-3  and  5-4,  the  recursive  and  batch  optimal  data  compres- 
sion algorithms  were  applied  to  the  selected  example  problem  presented 
previously.  The  substitution  is  rather  straightforward.  The  resultant 
algorithms  were  applied  with  exactly  the  same  set  of  parameters  and 
observations  used  previously.  The  results,  using  a data  compression 
ratio  of  H*5,  are  presented  in  Figures  5-4  and  5-5.  Although  each  of  the 
algorithms  have  different  processing  and  covariance  histories,  it  is 
important  to  emphasize  that  at  the  end  of  each  compression  interval; 
that  is  k-5  and  10,  the  error  variances  (or  standard  deviations)  and 
actual  estimates  are  identical  to  the  original  Kalman  Filter  presented 
in  Figure  5-2.  The  optimal  data  co^iression  algorithms  are,  in  fact, 
merely  the  optimal  Kalman  Filter  rearranged  to  account  for  the  time 
delays  and  lumping,  etc.t  occurring  with  the  data  compression  approach, 
the  principal  difference  in  the  error  standard  deviation  histories  of 
Figure  5-4  and  5-5  are  caused  merely  by  the  order  in  which  the  extrapola- 
tion and  update  steps  are  taken.  The  recursive  compressor  reverses  the 
more  conventional  order  and  updates  before  extrapolating. 

Examination  of  either  algorithm  reveals  a very  significant  problem 
that  has  not  been  discussed  yet  but  which,  in  certain  ci rcuaatances  can 
render  data  compression  implementations  either  computationally  impractical 
or  seriously  degraded  in  terms  of  performance.  Since  it  is  necessary  to 
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TABLE  5-3  EXAMPLE:  RECURSIVE  COMPRESSOR 
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L. 


Compressor  - 

For  t - k-H 
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a no 
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Table  5-4  Example:  Batch  Compressor  (continued) 


Gain 


>v' « * -,V  * <■>] 
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P'(k)  + tf 
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*»j  = 1,N  (II) 
(12) 


* = M (15) 
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pU)  = ?' (k) 
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FIGURE  5-4 


EXAMPLE:  RECURSIVE  COMPRESSOR  ERROR 
STANDARD  DEVIATION  AND  ERROR 


7fi 


FIGURE  5-5 

cxample:  batch  compressor  error 

STANDARD  DEVIATION  AND  ERROR 


refer  all  observations  to  a common  time  and  this  movement  in  time  obeys 
the  state  dynamics  equation,  additional  error  Is  introduced  due  to  the 
presence  of  plant  process  noise.  In  fact,  the  compressed  observation 
error  becomes  correlated  with  the  plant  noise  even  though  the  original 
problem  contained  no  such  correlation.  We  see  this  correlation,  for 
example,  in  the  expression  for  v*,  equation  (12)  in  Table  5-2,  and  in 
the  resulting  equations  In  both  algorithms.  Consideration  of  this 
correlation  was  taken  in  the  derivation  of  both  optimal  compression 
schemes.  The  presence  of  this  effect  results  in  the  major  addition  of 
complexity  to  the  data  compression  algorithms  over  that  of  the  original 
Kalman  Filter.  The  condition  for  neglecting  this  effect  and  the  tremen- 
dous simpliciation  that  results  is  presented  in  paragraph  5.3. 

There  is  another  rather  obvious  approach  to  the  optimal  data  compres- 
sion problem  that  is  somewhat  simpler  and  should  not  be  overlooked. 

This  approach  is  to  simply  let  the  data  compressor  be  a Kalman  Filter 
with  i*=l  the  first  point  and  i*N  the  last.  If  the  output  state  is 
evaluated  at  a time  other  than  i*N,  simply  utilize  optimal  Kalman  smooth- 
ing such  as  discussed  by  Gelb  [5-3] . The  output  state  vector  from  the 
compressor  then  becomes  the  input  compressed  observation  for  the  slow 
Kalman  Filter,  t,-  conditions  are  such  (no  process  noise)  to  insure  that 
the  compressor  estimate  is  totally  independent  of  the  slow  filter  and 
thus  represents  a new  uncorrelated  observation,  the  resulting  combination 
of  Kalman  Filters  will  be  equivalent  to  a single  fast  filter  and  will 
thus  be  optimal.  This  approach  is  favoured  under  these  circumstances 
since  it  lends  itself  to  analysis,  implementation  and  evaluation  easier 
than  the  other  two  approaches.  If  process  noise  is  present,  a conven- 
tional Kalman  Filter  which  accounts  for  observation  and  plan  correlation 
can  be  utilized.  An  example  of  such  an  algorithm  can  be  found  in  Sage 
and  Melsa  [5-11] . 

5.3  SUBOPTIMAL  DATA  COMPRESSION 

'r  s paragraph  will  show  how  practical  suboptiaal  data  compression 


78 


algorithms  can  be  derived  from  the  optimal  algorithms  contained  in  the 
previous  paragraphs.  Essentially  the  arguments  of  Womble  [5-5]  will  be 
reproduced  and  the  problem  progressively  si^»lified  by  adding  particular 
constraints  to  the  original  problem  definition. 

5.3.1  NEGLIGIBLE  PROCESS  NOISE 

The  greatest  simplification  that  can  be  made  to  the  data 
compression  problem  occurs  when  there  is  no  process  noise;  i.e.,  Q*0. 

If  the  recursive  algorithm  of  Table  5-1  is  considered  first,  it  is  found 
that,  using  the  initial  conditions  and  letting  Q s 0;  X ® 0 and  0*0. 

As  Wr.mble  points  out,  the  prefilter  transition  matrix  becomes  the  usual 

value 


*(i)  - #(i,  l)  (5.3-1) 

ar.4  the  prefilter  equations  reduce  to 


~(N> 


i=N 

£ *T(i,l)J(i>*(M) 

i-1 


(5.3-2) 


and 


z'OO 


i*N 

V #T  (i,  l)m(i) 


(5.3-3) 
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This  approach  essentially  results  in  the  compressed  observation  being  a 
rather  simple  -weighted  average  which  shows  considerable  computational 
advantage  over  the  original  Ea.Laan  Filtar.  Similarly,  the  batch  algorithm 
simplifies  since,  for  Q * 0,  Q*  * 0,  T*  * 0 and  R*  reduces  to  a much 
simpler  matrix  involving  only  the  original  R matrices.  It  begins  to 
become  obvious  that,  in  fact,  the  two  algorithms  actually  end  up  process- 
ing the  observations  identically  as  stated  previously.  Joglekar  points 
cut  that,  rather  than  having  Q vanish,  Q should  be  negligible  relative 
to  the  observation  error;  i.e., 

|jH(i)s(i,i+l)Q(i)4T(i,i+i)HT(i)|j  «||R(i)||  (5.3-4) 

where  the  double  brackets  denote  the  matrix  norm. 

Example:  For  the  simple  Markov  example,  equation  (5.3-4)  reduces  to 

a s(l-^)  «ct  (5.3-5) 

Y * » 

2 

Therefore,  it  is  reasonable  to  expect  to  invoke  this  assumption  if  o 

2 x 

<<  o“  when  the  process  noise  shows  little  variance  relative  to  the 

V 

observation  noise.  (The  process  begins  to  look  like  a constant  zero.) 

Also  for  the  limiting  cases  y-1  and  y~*0,  the  process  looks  like  a 
constant  bias  or  simply  white  noise  like  the  observations.  Of  course, 
this  last  case  makes  the  entire  attempt  of  estimation  ridiculous. 
Numerically,  this  particular  example  corresponds  to  (0*09)  « 1. 

5.3.2  NEGLIGIBLE  SYSTEM  DYNAMICS 

If  over  the  compression  interval,  the  system  dynamics 
appear  to  be  $ = I, additional  simplification  of  the  algorithms 
result.  This  condition  means  that  it  looks  like  the  measurements  o, -cur- 
at the  same  time  or  that  no  "reasonably  accurate"  dynamics  can  be  resolved 


SO 


over  the  interval  due  to  observation  error.  In  the  recursive  algoritha 
we  find 


(5.3-6) 


(5.3-7) 


In  the  batch  algoritha,  the  H*  Matrix  simplifies  considerably. 


5.3.3  STATIONARY  OBSERVATION  STATISTICS 


Also,  if  over  the  compression  interval,  the  observation 
statistics  do  not  change;  i.e.,  R(i)  * R for  all  1,  the  recursive  algo- 
ritha  looks  like  the  following: 


J(N)  * H 


'&-] 


(5.3-8) 


~w  * h'[w  *] ( » ) £>> 


(5.3-9) 


Joglekar  [5-4]  was  motivated  by  the  appearance  of  the 
weighted  average  measurement  compression  to  construct  "exact  averaging" 
algorithms  which  are  designed  to  give  the  best  estimate  of  the  state 
given  that  only  equally  weighted  averaged  measurements  are  available. 

He  obtaine-.  an  expression  for  the  information  loss  due  to  "exact  averag- 
ing." Hi'  exact  averaging  algorithms  included  consideration  of  process 
noise  and  o?;sociated  correlations.  The  development  is  quite  lengthy  and 
will  not  be  repeated  here. 

5.4  SENSITIVE TY  ANALYSIS 

A particularly  significant  advantage  for  developing  optimal  data 
compression  algorithms  is  that  they  provide  a performance  standard  for 
comparison  and  evaluation  of  suboptimal  realizeable  approaches.  As 
shown  in  paragraph  5.3,  it  is  possible  to  determine  exactly  those  terms 
that  were  chosen  to  be  neglected  awl  check  the  validity  of  the  assumption, 
'’nfortur.ately,  when  suboptimal,  the  associated  error  covariance  calcula- 
tions are  incorrect  since  they  are  based  upon  simplifying  assumptions. 
Therefore,  the  calculated  suboptimal  error  covariance  can  no  longer  be 
used  as  a true  measure  of  estimation  performance.  Fortunately,  however, 
optimal  estimation  theory  comes  to  the  rescue  by  providing  a means  to 
calculate  the  actual  error  covariance  of  a suboptimal  implementation  and 
thus  compare  it  with  the  optimal  to  determine  the  level  of  performance 
degradation.  Again,  Gelb*s  book  [5-3]  provides  an  excellent  discussion 
of  suboptimal  filter  design  and  sensitivity  analysis. 

In  order  to  calculate  the  actual  covariance  of  a suboptimal  design. 

It  is  necessary  to  build  a sensitivity  algorithm  tailored  to  fit  the 
original  compression  approach.  Therefore  each  wf  the  three  optimal 
algor i thas  in  paragraph  5.2  must  have  their  own  associated  sensitivity 
algorithm.  In  his  report,  Joglekar  [5-4]  derives  equations  for  the 
actual  covariance  when  using  the  averaging  type  compression  algorithms 
he  derived.  The  author  does  not,  however,  provide  a sensitivity  algorithm 
tor  the  general  batch  compressor.  Womble  [5-5,  6]  also  falls  to  provide 


a sensitivity  algorithm  for  the  prefilter.  Sensitivity  is  therefore 
clearly  an  area  of  data  c Depression  requiring  additional  work  if  designers 
are  to  have  a complete  set  of  tools  with  which  to  develop  practical  data 
compression  algorithms. 

5.5  GUIDELINES  FOR  OPTIMAL  DATA  COMPRESSION  DESIGN 
5.5.1  ESTIMATION  KATE  AND  SHANNON'S  THEOREM 

The  first  question  to  be  considered  involves  how  frequently 
to  estimate  the  state  of  the  system  to  specify  accurately  the  state  at 
all  times.  Shannon's  Theorem  - found  in  Monroe  [5-12]  - says  that  if  a 
signal  is  bandlimited  and  contains  no  frequency  greater  than  o)si^na^ 
(radians/  second)  then  it  is  possible,  in  principle,  to  recover  completely 
the  original  signal  from  the  sampled  signal  if  sampled  at  a minimum  rate 


Signal  A 88COnd 


(5.5-1) 


This  is  to  say  in  theory,  no  information  is  lost  if  the  signal  is 
perfectly  sampled  at  that  rate  or  faster.  Since  it  is  desirable  to 
reconstruct  the  signal  as  accurately  as  possible  and  with  a minimum  loss 
of  Information,  cycle  the  signal  estimator  (or  slow  filter)  no  slower 
than  Qg.  In  fact,  since  there  are  no  perfect  samples  or  perfect  estima- 
tors, estimate  even  faster  than  Qs  - perhaps  by  a factor  of  two  to  ten. 
Another  problem  is  that  real-world  signals  are  not  often  truly  bandlimited 
but  often  only  an  accurate  estimate  of  the  lower  frequency  components  is 
of  interest.  Shannon  can  still  be  used  as  a guideline  to  select  the 
estimation  rate  but  consideration  must  also  be  given  to  the  affects  of 
the  higher  frequency. 

5.5,2  SAMPLING  RATE  AND  THE  HYQUIST  FREQUENCY 


The  Nyquist  Frequency  or  folding  frequency  is  defined  by 
Benda t and  Piersol  [5-13]  as 


% “ * Q.a^iiag(r^lw/*ccond>  (5.5-2) 

The  similarity  to  equation  (5.5-i)  is  undoubtedly  not  coincidental.  If 
there  is  any  frequency  component  in  the  signal  - be  it  due  to  observation 
error  or  plant  noise  - there  will  bs  confusion  between  the  higher  freq- 
uency components  and  the  lower  frequency  components  that  are  presumably 
of  interest.  This  problem  is  well  known  as  aliasing  or  the  "folding*’  of 
high  frequency  components  into  the  low  frequency.  This  is  inherent  in 
all  analog- to-digital  sampling  systems.  There  are  two  practical  methods 
of  handling  the  aliasing  problem.  The  first  is  to  simply  raise  by 
raising  the  sampling  frequency  until  there  is  no  frequency  component 
above  This  technique  is  not  always  practical  however.  The  second 

and  more  efficient  method  is  to  simply  analog  filter  the  data  prior  to 
sampling  or  digitally  prefilter  the  data  by  simply  averaging  it  in 
batches  as  in  data  compression.  The  analog  and  digital  prefilters  are 
in  fact  complimentary;  the  analog  being  prefer: id  to  remove  very  high 
frequency  noise  (r* lative  to  the  signal)  and  the  digital  to  remove  noise 
which  is  not  so  high  compared  to  the  signal.  Joglekar  [5-4]  discusses 
this  in  greater  detail  in  his  paper. 

5.5.3  SERIALLY  CORRELATED  OBSERVATION  ERROR 

If  the  observation  error  has  a bandliraited  serial  correla- 
tion, either  naturally  or  due  to  the  pref liter ing,  the  effects  on  the 
Information  content  of  the  observations  as  a function  of  the  data  rate 
and  correlation  should  be  considered.  As  an  example,  follow  the  arguments 
of  Clark  [5-8]  and  consider  exponentially  correlated  observation  error 
where  the  correlation  coefficient  of  the  original  data  is  given  as 


i(T)  - E[v(t)v(t+T) ]/ E[v^(t)3 
• exp  <-|T|/t„ 


(5.5-3) 


where  T is  the  correlation  time  constant.  The  discrete  noise  propaga- 
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tioo  equation  for  this  stationary  case  is  then 


v(k)  - p(At)v(k-l)  +3^^)  |(k-l) 


which  is  a simple  linear  systen  driven  by  white  noise  of  variance  ar 

2 ’ 
related  to  the  ouput  variance  by  the  relation 


" ffv  - oa  (At) 


(5.5-5) 


Now  assume  that  N measurements  are  again  to  be  compressed  utilizing  an 

averaging  technique  to  yield  a single  compressed  observation.  The 

variance  and  correlation  time  of  the  compressed  measurement  as  a function 

of  the  original  statistics  and  the  compression  ratio  should  now  be 

determined.  The  compressed  observation  error  v is  given  simply  as  the 

c 

average 
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N ) v(k-lfH) 


(5.5-6) 


By  substituting  this  into  the  appropriate  definitions  and  taking  expected 
values,  it  is  easy  to  show. 


■A-jr 


(5.5-7) 


where 


*1  * H* 


If 


M\ 


(5.5-8) 


which  can  be  simplified  to 


(5.5-9) 
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The  effective  correlation  time  of  the  compressed  observation  is 


related  to  the  original  x by 


'c  _ K In 


in  (l/oc) 


(5.5-10) 


where 


Pc  * S2  /S! 


(5.5-11) 
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In  Figures  5-6  and  5-7  the  ratios  are  plotted  as  a function  of  the 
compression  ratio  for  various  levels  of  relative  correlation.  Large 
values  of  At/t  imply*  less  correlation  than  small  values.  In  Figure  5-6 
we  find  (as  we  might  expect)  that  for  essentially  uncorrelated  error 
(at/Tv  * 10)  the  error  reduction  behaves  ideally  as  1,Vk.  As  the  correla- 
tion increases,  the  less  Independent  information  is  received  and  improve- 
ment diminishes. 


In  fact,  for  high  correlation  (At/iy  0*1),  little  improvement  is  observed 
even  after  20  samples.  Figure  5-7,  shows  that  for  conditions  of  high 
correlation,  the  data  compression  process  does  not  significantly  increase 
the  b3SlC  correlati°n  tiae.  However,  a dramatic  increase  of  correlation 
time  is  realized  by  compressing  observations  that  originally  contained 
little  correlation.  Joglekar  [5-4]  recommends  a sampling  rate  such  that 

0-25  <:  At h s 1.0 

V 51 

in  order  to  efficiently  recover  most  of  the  information. 
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CHAPTER  6 

MAXIMIZATION  OF  INFORMATION  CONTENT 


6.1  INTRODUCTION 

Sometimes  it  is  necessary  to  restore,  in  total,  exceedingly  large 
amounts  of  data  that  have  been  collected.  This  is  especially  true  of 
projects  where  data  is  collected  by  one  responsible  agency,  stored  and 
retrieved  by  another  agency,  and  used  by  several  different  agencies  fur 
different  purposes. 

For  example,  live  aircraft  test  data  collected  under  varing  environ- 
ments may  be  desired  by  agencies  interested  in  missile  simulations, 
others  interested  in  aircraft  performance,  still  others  interested  in 
instrumentation  accuracies,  etc.  Often  the  storage  of  such  data  ir 
referred  to  as  a "Data  Base"  or  a "Data  Bank."  The  designer  of  such  a 
system  encounters  problems  that  do  not  normally  arise  when  smaller 
amounts  of  data  are  involved. 

It  is  not  unusual  for  such  data  bases  to  contain  several  million  to 
a billion  or  more  words  of  data.  The  cost  involved  in  the  storage  and 
retrieval  of  such  data  can  be  prohibitive  if  careful  planning  is  not 
made  in  the  design  phases. 

The  purpose  of  this  chapter  is  i.c  suggest  poetical  by  which 

the  sheer  volume  of  the  data  can  be  reduced  xf  tradeoffs  in  accuracy  ani 
retrieval  costs  can  be  accepted.  Hopefully,  this  will  give  the  des:gner 
a starting  point  when  faced  with  a large  volume  of  data  to  be  stored  and 
retrieved.  Additionally,  suggested  ways  for  presenting  the  large  amounts 
of  data  to  the  user  will  be  discussed  with  a few  general  purpose  graphic 
routines  presented  in  paragraph  6.4. 
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Preceding  page  blank 


6.2  VOLUME  REDUCTION 


For  the  purpose  of  discusion,  consider  the  following  example. 

A particular  project  expects  to  fly  500  air  - air  combat  training 
missions.  It  is  desired  to  retain  from  each  aircraft,  in  time-history 
form,  the  following  parameters  for  future  investigation. 


Description 


No.  Parameters 


Time  1 
Position  3 
Velocity  3 
Accelerat ion  3 
Attitude  3 
Angle  of  Attitude/ 

Angle  of  Side  Slip  2 
Aiming  Parameters  3 
Aspect  Parameters  5 
Target  ID  1 
Power  Setting  1 
Fire  Signal  l 
Relative  Winds  3> 

Total  27 


If  tcyr  aircraft  participate  for  an  average  of  30  minutes  par 
mission  and  the  collection  scheme  is  10  sasples/sec,  the  total  number  of 
data  vot  * collected  would  be  27  X 4 X 10  X 60  X 30  S 500  « 972  X I06 
words.  Wit n presenc  storage  devices,  the  cost  of  storage  and  retrieval 
would  hr  prohibitive  unless  the  volume  could  be  reduced. 

A fxt:{,r.  step  in  approaching  the  problem  should  be  to  investigate 
other  metnc.d8  i*.ew**r i in  previous  chapters  of  this  document  for  reducing 
the  number  of  words  that  <uust  be  stored.  For  example,  it  may  not  be 
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necessary  to  retain  10  samples/sec  for  every  parameter.  Using  the  sampling 
techniques  discussed  in  Chapter  2,  alternate  sampling  rates  can  be 
derived  which  may  reduce  the  total  number  of  words  by  a factor  of  two 
or  more. 

Parameters  such  as  fire  signals,  power  settings  and  target  identifi- 
cation change  relatively  few  times  during  a given  mission.  These  can  be 
retained  on  a separate  file,  recording  only  the  change  and  time  of 
change. 

6.2.3  RECOMPUTING 

Investigation  should  be  made  into  the  need  for  retaining 
every  parameter.  Could  some  parameters  be  computed  from  others  at 
retrieval  time  with  acceptable  computer  costs? 

In  the  example  given,  velocity  and  acceleration  can  be 
computed  from  position.  Aiming  and  aspect  angles  can  be  derived  from 
position  and  attitude.  Inertial  angle  of  attack/angle  of  side  slip 
can  be  computed  from  relative  winds,  velocity,  and  attitude  data. 

Relative  winds  can  be  derived  from  wind  tables  stored  in  a different 
file.  Assuming  that  target  ID,  power  set. iug,  fire  signals,  and  wind 
tables  are  stored  on  separate  files  (the  magnitude  of  these  files  would 
be  relatively  small  in  comparison)  and  the  pa? ameters  mentioned  above 
can  be  recomputed,  the  number  of  words/sample  becomes  6 instead  of  27. 

The  reduction  f actor  as d , 8: 1 . 

6.2.2  SCALING  AND  PACKING 


Scaling  a parameter  simply  means  determining  the  absolute 
resolution  that  must  be  maintained  when  the  data  is  retrieved.  It  is 
important  because  the  resolution  determines  the  minimum  number  of  bits 
necessa*y  to  retain  the  parameter. 
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As suae,  in  the  example  given,  that  a stored  position 
resolution  of  1 foot  with  angular  resolution  of  .1  degree  is  sufficient 
to  retain  the  necessary  accuracy  when  the  data  is  reproduced.  If  the 
missions  are  to  be  flown  in  an  airspace  with  a diameter  of  50  miles, 
then  the  dynamic  range  of  a position  word  is  ±264000  ft.  The  number  of 
bits  necessary  to  represent  a positional  parameter  to  l foot  resolution 
is  20  bits.  The  compression  ratio  Cor  a CDC  6600  computer  word  is  3:1. 

For  the  32-bit  word  machines  the  ratio  is  only  1.6:1. 

Additional  compression  may  be  realized  by  making  use  of 
the  fact  that  the  dynamic  range  of  the  first  difference  in  position  is 
usually  within  ±200GxAt  where  At  is  the  sampling  interval  in  seconds. 

I:  \t  = .2,  the  first  difference  lies  between  ±400  which  can  be  retained 
in  iO  bits.  If  only  the  magnitude  of  the  first  differences  were  retained, 

9 nits  would  suffice. 

In  order  to  retain  1 foot  resolution,  it  is  necessary  to 
periodically  record  the  full  position  word  with  intermediate  updating  of 
position  from  the  first  difference.  If  the  first  differences  are  retained 
to  .1  ft  resolution  with  rounding,  the  maximum  error  contributed  by  a 
j’ngle  sample  is  .05  ft.  Assuming  that  uniform  distribution  of  error  is 
between  Q - .05,  the  average  error  contributed  would  be  .025  ft/sample, 
if  the  retained  sample  race  is  5/sec  and  the  full  position  is  recorded 
i very  4 seconds,  average  cumulative  error  would  be  approximately  .5  ft- 
Crhe  20th  periodic  samples  would  be  the  updated  position).  The  additional 
compression  realized  by  this  scheme  would  be  (20X20:20+20X10)  * 400:220*2:1. 

6.2.2.  1 EXPONENTIAL  PACKING 

Sometimes,  as  in  the  case  of  radiometric  data, 
the  dynamic  range  of  a variable  is  extremely  large.  Additionally  the 
rate  of  change  can  be  of  such  magnitude  as  to  preclude  using  the 
first-difference  technique  described  previously.  Usually  in  such  cases, 
absolute  accuracy  is  not  required.  Instead  a given  number  of  significant 
digits  of  accuracy  would  be  sufficient.  Using  this  criteria,  an  exponen- 
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tial  packing  scheme  could  be  devised  such  that  the  exponent  of  a variable 
could  be  retained  in  a few  bits  with  the  normalized  (leading  zeroes 
suppressed)  variables  presented  to  the  desired  number  of  significant 
digits. 

This  scheme  can  be  very  useful  if  the  word  size 
on  the  computer  is  relatively  large  and  the  computer  contains  floating 
point  arithmetic.  Consider  the  CPC  6600  computer  word  for  example.  The 
characteristic  and  sign  are  in  1'3  bits  whereas  47  oits  are  used  to 
represent  the  mantissa. 

Making  use  of  the  CDC  normalized  floating 
arithmetic  with  shift  and  mask  instructions,  two  words  with  six  signifi- 
cant digits  of  accuracy  may  be  packed  into  a single  word.  The  compres- 
sion ratio  is  2:1. 


Word  1 Word  2 


30  bits  f 30  bits 


Packed  Word 

An  advantage  of  this  scheme  is  that  the  da. a is  already  ir.  acceptable 
floa  ing  point  representation  and  does  not  need  <.  «^-parate  cable  to 
retain  scale  factors. 

t.2.2.2  FIXED-LENGTH  H7.NIMUK  BIT 

A simple  example  ,'f  a fixed-length  Minimum  bit 
scheme  wcv'ld  be  the  use  of  a single  bit  to  represent  fire  signal;  zero  » 
no  firt,  one  ■*  fire.  For  a CDC  6600  computer  word  the  savings  is  60:1. 

Generally,  however,  the  data  cannot  be  represented 
by  a sing.e  bit  but  in  a&ny  cases  there  is  a minimus  number  of  bits 
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which  can  be  used  to  represent  the  full  dynamic  range  of  data.  (If  the 
dynamic  range  is  relatively  large,  an  alternate  scheme  such  as  exponential 
packing,  variable  length  minimum  or  a table  of  external  scale  factors 
would  be  desirable.) 

Consider  attitude,  for  example,  with  a full  dy- 
namic range  0-360  degrees.  If  .1  degree  resolution  with  .05  degree  accu- 
racy is  sufficient  the  dynamic  range  would  be  0-3600  with  a scale  factor  of 

10  which  could  be  retained  in  12  bits.  The  storage  savings  wouxd  be  5:1 
:or  a CDC  6600  computer  word.  The  technique  to  pack  words  is  simply  to 
multiply  the  original  word  by  10,  round,  integerize  and  pack  using  shift 
<jnd  mask  instruction.  To  unpack,  simply  mask,  shift,  and  divide  by  10. 

The  technique  does  net  make  full  use  of  the 
•:orags  capability  of  12  bits.  If  the  scaling  factor  were  change  to 
»095/360,  an  accuracv  of  .QUIU  instead  of  .05  could  be  realized. 

An  alternate  version  of  a f ixed- length,  minimum- 
hit  scheme  would  be  to  retain  a table  of  scale  factors  with  sufficient 
additional  bits  allocated  to  each  word  for  pointing  to  the  correct  entry 
.n  the  table.  This  scheme  allows  for  a broad  dynamic  range  of  a given 
variable. 

6.  2.2. 3 VARiABhE-LENCm  MINIMUM  BIT 

An  alternate  form  or  the  mioisu.3  bit  scheme  is 
to  use  a variable  number  of  bits  to  represent  a parameter  with  a broad 
dynamic  range.  A truly  variable  scheme  would  require  an  external  table 
with  entries  pointing  to  the  number  of  bits  used  to  represent  a parameter 
at  a given  time.  There  is  the  additional  need  for  a pointer  to  point  to 
the  correct  entry  that  must  be  retained  with  each  sample. 


n modified  version  of  the  variable  bit  scheme 
would  be  to  divide  the  dynamic  range  into  bands  with  a given  number  of 
bits  allocated  tor  each  band.  A pointer  Is  retained  with  eacn  word  that 
would  point  to  the  correct  band  with  an  inherent  number  of  bits. 
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Consider  a variable  with  a dynamic  range  of  0- 
50000  with  unity  accuracy  and  resolution  requirements.  The  data  fell 
between  0-200  ninety  percent  of  the  time  and  was  greater  than  200  only 
ten  percent  of  the  time.  If  two  bands  were  allocated  containing  8 bits 
and  16  bits  respectively  with  a single  bit  to  point  to  the  correct  band, 
the  savings  over  a fixed  length  minimum  bit  scheme  would  be 

16: (.9X(8)+. 1x16+1)  - 16:9.8 
For  a CDC  6600  computer  word,  the  compression  ratio  is 

60:9.8  6:1 

Additional  computer  cost  is  involved  to  obtain  the  correct  number  of 
bits  for  shifting. 

6.3  PRESENTATION 

One  of  the  most  important  and  sometimes  least  emphasized  areas  of 
data  reduction  is  data  presentation.  Often  a simple  change  of  an 
output  format  can  mean  saving  many  manhours  in  data  analysis.  Appropriate 
selection  of  numerical  and  graphical  presentations  can  sometimes  mean 
the  difference  between  an  accurate  analysis  or  one  that  is  biased  by  the 
analyst  simply  because  he  was  not  able  to  observe  unexpected  relationships 
or  detect  system  errors. 

6.3.1  NUMERICAL  PRESENTATION 

Numerical  presentation  implies  presenting  the  data  in  a 
numerical  format  whether  it  be  a simple  printout  of  data  or  more  sophis- 
ticated schemes  of  using  numerics  (or  symbols)  to  represent  various 
conditions  or  levels.  Examples  of  such  are  digital  pictures  or  number 
graphs. 
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6.3. 1.1  THE  TIME-HISTORY  LIST 

When  presenting  data  in  a time  sequential 
format,  a column  presentation  is  usually  preferred.  Every  effort  should 
be  made  to  output  only  a single  parameter  for  a given  field.  This 
allows  the  user  to  scan  a column  and  observe  trends  without  having  to 
search  for  the  parameters  in  a maze  of  printouts. 

Often  the  number  of  available  print  columns  for 
a given  listing  is  not  sufficient  for  presenting  all  the  desired  parameters. 
In  these  Instances,  additional  listings  should  be  generated,  usually 
with  time  on  each  listing  for  easy  correlation.  The  simplest  method 
involves  generating  the  additional  listings  on  separate  files  with 
disposition  to  a printing  device. 

If  additional  files  are  not  available  due  to 
program  limitations,  the  data  can  be  written  to  a single  file  with 
appropriate  code  numbers  to  indicate  separate  listing.  Before  printing, 
the  file  can  be  sorted  and  printed  by  code  number. 

6.3. 1.2  REDUCING  PRINTOUT 


As  a general  rule,  a printout  of  every  sample 
is  neither  desired  nor  needed.  Selected  samples  that  show  significant 
levels,  changes  or  samples  at  significant  events  are  favored. 

When  the  requirement  is  for  data  only  during 
and  after  significant  changes,  the  programming  is  easy  to  implement. 
When  data  prior  to  significant  events  is  desired,  the  implementation  is 
not  as  easy.  If  sufficient  core  storage  is  available,  a rotary  buffer 
could  be  maintained  with  sufficient  past  history  retained  to  print  the 
required  data  prior  to  events.  The  modular  function  available  on  most 
compilers  is  an  excellent  tool  for  retaining  the  current  address  in  the 
rotary  buffer.  (Similar  rotary  schemes  are  often  used  when  doing  mid- 
point smoothing  and  editing  of  data.) 
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6. 3. 1.3  FIELD  REDUCTION 


Often  the  number  of  columns  needed  to  represent 
a given  parameter  can  be  reduced  by  scaling  techniques  discussed  previous! 
Other  methods  include  printing  tha  data  in  integer  format  with  implied 
decimal.  If  a parameter,  such  as  vise,  has  columns  that  change  infre- 
quently (e.g.,  hours,  minutes),  these  can  be  written  at  the  top  of  aach 
listing  with  less  print  colusns  assigned  to  the  parameter  itself.  In 
any  case,  the  columns  assigned  to  a given  parameter  should  remain  constant 
to  avoid  confusion.  It  is  not  unusual  for  thirty  or  more  parameters  to 
be  listed  on  a single  page  in  column  format  with  proper  scaling  and 
techniques. 


6.3. 1.4  MATRIX  PRESENTATION 


When  data  is  of  a matrix  nature  such  as  pictures, 
ceil  structures,  etc.,  effort  should  be  made  to  present  the  data  in  a 
matrix  format.  If  all  required  columns  (or  rows)  of  the  data  cannot  be 
displayed  on  a single  line,  additional  listings  should  be  generated  such 
that  the  listings  could  be  viewed  together  to  observe  the  data  in  aatrix 
format.  If  irrelevant  data  is  contained  in  the  aatrix,  these  values 
should  be  set  to  blank  for  printout  purposes. 


o.  i.  , 


GRAPHICAL  PRESENTATIONS 


Graphical  presentations  have  advantages  over  numerical 
presentations  in  that  such  mure  data  can  be  presented  in  an  easily 
assimilated  manner.  A disadvantage  is  that  sore  computer  time  is  needed 
and  additional  and  sometimes  complex  aatheitatics  oust  be  programmed  to 
construct  visual  pictures. 

There  are  numerous  texts,  papers,  and  articles  devoted  to 
all  phases  of  computer  graphics;  from  simple  graphs  to  complex  3-D  color 
movies  and  holograms.  This  discussion  will  mention  the  advantages  of  a 
few  basic  types  of  graphic  presentations  with  simplified  algorithms  for 
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producing  more  complex  plots  such  as  3-D  and  surface  plots  with  hidden 
line  removal. 


6 . 3 . 2 . 1 RECTANGULAR  PLOT 


The  rectangular  plot  is  probably  the  simplest 
and  most  used  of  all  types  of  plots.  It  simply  involves  plotting  a 
dependent  variable  or  variable  on  a vertical  scale  as  a function  of  an 
independent  variable  on  a horizontal  scale.  Uses  include  quick-look 
editing,  observing  trends  and  functional  relationships. 

6. 3. 2. 2  POLAR  PLOT 


The  polar  plot  is  useful  for  pictorially  repre- 
senting the  function  g = f(G).  To  plot,  the  function  should  be  mapped 
into  rectangular  plot  coordinates  (U,V)  by  the  following: 


U = g sin  (0) 

V = g cos  (0)  (6.3-1) 

Useful  examples  of  polar  plots  include  vulnerability  envelopes,  antenna 
patterns,  and  radiation  patterns. 

6. 3. 2. 3  HISTOGRAM 


Histogram  plots  are  used  in  determining  the 
distribution  of  a given  set  of  data.  They  are  often  used  in  conjunction 
with  and  in  lieu  of  statistical  measurements.  A goodness  of  fit  can 
often  be  inferred  by  a simple  histogram. 

6. 3. 2. 4  TIME-HISTORY  PLOT 

The  time-history  plot  is  used  for  observing 
data  trends  or  drifts,  noise,  biases,  anamolies,  timing  problems  and 
interrelationships  between  variables.  The  most  uncomplicated  time 


102 


l 


history  is  a simple  rectangular  plot  with  time  the  independent  variable 
and  the  test  item  the  dependent  variable. 

As  a general  rule,  however,  the  time  of  interst 
is  of  such  magnitude  as  to  preclude  putting  it  on  a single  frame  (plot). 

To  achieve  the  continuous  format,  several  frames  must  be  abutted,  some- 
times requiring  complex  programming. 

6. 3. 2. 5 THE  3-D  PLOT 

The  3-D  type  plot  is  a plot  whereby  relationships 
in  width,  aepth,  and  height  may  be  observed  in  a single  picture.  An 
extension  to  this  concept  may  be  a family  of  functions  displayed  in  some 
increment  of  a changing  dependent  variable. 

There  are  many  methods  of  constructing  a 3-D 
picture  using  various  gray-scale  techniques,  color  schemes,  and  geometries. 

This  discussion  will  present  three  geometric  methods  for  determining  a 
given  point  represented  by  three  coordinates  (X,Y,Z)  on  a plotting  plane 
in  a 3-dimensional  framework.  A line  can  be  drawn  by  determining  the 
location  of  its  two  end  points. 

6 . 3 . 2 . 5 . 1 OBLIQUE  METHOD 

The  oblique  picture  is  one  in  which  two  of 
the  axes  are  always  at  right  angles  to  each  other,  being  in  a plane 
parallel  to  the  image  plane  with  the  third  or  "depth"  axis  being  at  any 
angle  (except  90  degrees)  to  the  vertical  (60  degrees  or  45  degrees 
being  generally  used).  The  location  of  the  point  (X,Y,Z)  can  be  found 
by  going  along  one  axis  at  a distance  equal  to  the  corresponding 
coordinate  and  then  parallel  to  each  of  the  other  axes  at  distances 

equal  to  the  corresponding  coordinates.  This  can  also  be  done  mathemat-  . 

ically  by  finding  the  horizontal  and  vertical  distances  from  the  point 
of  the  origin  in  the  image  plane  to  the  point  in  question  in  the  image 
plane. 
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As  an  example,  consider  the  coordinate 
system  drawn  as  indicated  in  Figure  6-1. 


00 

•H 

<L* 

C 


c 


Picture  Plane  Axis 


-Y 


Figure  6-1 


If  a is  taken  as  the  positive  angle  between  the  positive  X axis  and  the 
positive  Z axis  and  (u,v)  are  the  horizontal  and  vertical  coordinates  of 
the  point  (X,Y,Z)  in  the  image  plane  relative  to  the  picture  origin. 


u = XSin(a)-Y 
v XCos(a)+Z 


(6.3-2) 


Although  the  oblique  method  is  a relatively  simple  means  of  dipicting 
3-D,  a certain  amount  of  distortion  may  exist  if  angle  a is  not  * i . ’ ly 
chosen. 
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6. 3. 2. 5. 2 


ROTATION  MATRIX 


The  following  matrix  is  useful  for  rotating 
the  point  (X,Y,Z)  through  angles  0,  <p  (attitude  angles  of  a viewer) 
for  projection  to  a plane  normal  to  a line  of  sight.  The  angles  are 
defined  in  reference  to  the  coordinate  system  depicted  in  Figure  6-2. 
Positive  rotation  is  clockwise  looking  out  the  axis  of  rotation. 


cyco 

SY  CO 

-SO 

Cfses^-SYC^. 

SYSGS^+CYC* 

C0S<j> 

(6.3-3) 

CYSOC^+SYS^ 

S4'S0C$-C'i'S<J> 

COS* 

C indicates 
S indicates 

Cosine  function 
Sine  function 

Figure  6-2 


6.3.2. 5. 3 


AXONOHETRIC  METHOD 


The  axonoaetric  method  is  theoretically  an 
orthographic  projection  (parallel  projection  to  the  view  plane)  of  an 
object  to  the  image  plane;  the  object  being  rotated  such  that  three 
faces  show.  If  i j/,  0 are  the  attitude  angles  of  the  viewer,  the 

coordinates  (u,v)  in  the  picture  plane  of  a given  point  (X,Y,Z)  can  be 
found  as  follows: 


u 

X 

= r 

Y 

1 V 

Z 

(6.3-4) 


Where  L is  a 2x3  matrix  defined  as  follows: 


r = q1 
i.J  i+l,J 


i * 1.2  J = 1,2,3  (6-3-5) 


In  the  axonometric  method,  the  picture  plane  does  not  need  to  be  fixed 
but  can  be  located  anywhere  along  the  line  of  sight  (LOS).  (See  Figure  6-3.) 


6. 3.2. 5. 4 


PERSPECTIVE  METHOD 


The  perspective  anhod  is  mien  like  the 
axonoaetric  method  in  that  a viewpoint  is  specified  and  the  object  is 
rotated  through  the  aspect  angles  of  the  viewer.  Instead  of  orthographic 
projection,  the  rotated  point  is  projected  to  a fixed  inage  plane  along 
a line  froa  the  point  in  question  to  the  viewpoint  (see  Figure  6-4). 


t 

( 
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Figure  6-4 

If  ip,  <p,  6 are  attitude  angles  of  the 
viewer  as  defined  previously,  then  the  coordinate  (u,v)  in  the  picture 
plane  can  be  found  as  follows: 


X 

X 

V 

A 

= n 

Y 

z 

Z 

R 


(6.3-6) 
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u = (Y  )(R  /X  ) 
R V R 


v = (Z  )(R  /X  ) 
R V R 


(6.3-7) 


Ry  is  the  normal  distance  from  the  picture  frame  to  the  viewpoint. 

In  equation  (6.3-6)  above,  the  coordinates 
Yr,  Zr)  represent  a point  defined  in  a system  where  the  origin  is 
at  the  viewpoint  with  the  positive  X axis  along  the  LOS,  the  positive  Y 
axis  to  the  left  and  the  positive  Z axis  up.  The  equations  for  (u,v)  in 
equation  (6.3-7}  res  lit  from  a similar  triangle  relationship  developed 
is.  basic  projective  geometry.  To  illustrate,  let  A be  the  plane  normal 
to  the  LOS  and  passing  through  the  point  (X,  Y,  Z)  and  let  B be  the 
image  plane  (see  Figure  6-5). 


f 

f 


V\  can  be  lMSlne<i  *s  a variable  scaling  factor  which 
scales  the  object  laage  „ a function  of  distance  fro.  the  viewer.  The 
-aropective  ~thod,  though  acre  cneples,  provide,  a picture  in  which  it 
is  easier  to  visualize  relative  distances. 
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6.3.3  COMPUTER  GENERATED  MOVIES 

An  area  of  computer  graphics  worthy  of  Mentioning  is  the 
use  of  the  computer  for  generation  of  a series  of  pictures  on  fils 
suitable  for  showing  with  a Movie  projector.  The  effect  is  an  animated 
sequence  approximating  the  dynamic  actions  of  the  objects  in  the  fixtures. 

The  steps  for  constructing  a simplified  movie  of  an 
object  described  by  line  structure  is  as  follows: 

1.  Advance  frame 

2.  Scale  frame 

3.  Rotate  all  objects  through  view  angles 

4.  Construct  object  on  frame  as  per  one  of  the  previously 
described  methods. 

5.  Advance  frame 

The  steps  are  deceptively  simple.  The  most  difficult  is  usually  the 
scaling  of  the  frame  and  objects  such  that  a realistic  picture  is  achieved. 

For  further  information,  the  reader  is  referred  to  the 
paper  "Constructing  2-D  Pictures  of  3-D  Objects  With  A Digital  Cor.puter” 
(ref  6.  .),  and  Program  P1932,  "Generalized  Movie  Making  Program"  developed 
by  the  Directorate  of  Computer  Sciences,  ADTC,  Egiin  AFB,  FL. 

6.3.4  HIDDEN  LINE  ELIMINATOR 

Numerous  techniques  have  been  developed  for  the  removal 
of  hidden  lines  (lines  that  are  not  seen  when  viewing  an  object  or 
surface)  in  a picture  by  a computer.  As  a general  rule,  each  technique 
has  peculiar  applications.  The  reader  is  referred  to  the  bibliography 
for  references  on  the  various  techniques. 
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One  technique  that  is  fairly  easy  to  implement  and  has 
application  when  attempting  to  plot  a surface,  a matrix,  or  family  of 
functions  sake*  use  of  the  following  well  known  principle:  If  a surface 

can  be  described  by  a family  of  curves  and  the  curves  are  ordered  from 
the  foreground  to  the  background,  a curve  becomes  invisible  at  points  of 
intersection  with  curves  that  are  further  in  the  foreground.  These 
’>oints  of  intersect ion  may  be  easily  found  if  a "visibility”  curve  is 
established  in  the  2-dtmensional  plotting  system  consisting  of  the 
maximum  (positive  up)  vertical  plotting  unit  encountered  for  each  hori- 
zontal plotting  unit.  The  new  curve  tc  be  plotted  becomes  invisible  at 
all  points  where  the  vertical  units  of  the  new  curve  are  below  the 
corresponding  vertical  units  on  the  visibility  curve.  A new  visibility 
curve  is  established  each  time  a new  curve  Is  plotted. 


NOTE 

The  above  assumes  the  surface  does  not  become 
visible  from  the  underside.  If  the  surface 
is  visible  from  the  underside,  a "minimum" 
visibility  line  may  also  be  established 
consisting  of  the  minimum  vertical  plotting 
unit  encountered  for  each  horizontal  plotting 
unit.  The  curve  in  question  is  invisible  at 
all  points  ''here  the  corresponding  vertical 
units  are  below  the  maximum  and  above  the 
minimum  visibility  curves. 


Ill 


The  following  FORTRAN  subroutines  ny  be  used  to  construct  time- 
history  plots  and  oblique  3-D  surface  plots.  Use  is  node  of  an  SC4020 
Plotting  Package  which  contains  routines  for  constructing  lines,  scaling 
and  labeling.  If  the  user  does  not  have  access  to  tfce  SC4020  Plotting 
Package,  appropriate  routines  will  need  to  be  substituted.  The 
aigorith.,  however,  will  reoaln  the  aa»e.  Counts  within  each  routine 
def.ne  the  routine  function  and  interaction  with  other  routines. 


t 

I 
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SUBROUTINE  PL OT IT <T IME.NPGR, ISMB*DX* OY .OZI 
DIMENSION  OX(NPGR) ,0Y (NPGR) *OZ (NPGR)  « ISHBtNPGRI 
DIMENSION  CENTC3) 

DIMENSION  SCAL (3) 


DIMEN 


I61INE  1 10 ) • l'TLINE (101 


cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc 


THIS  IS  A GENERAL  PURPOSE  ROUTINE  FOR  PLOTTING  1-3  GRAPHS  ON 
A CONTINUOUS  ABUTTED  FRAME  OUTPUT. 

IT  USES  AN  SC-4020  PLOTTING  PACKAGE  TO  CONSTRUCT  LIKES  ANO  LABEL 


♦**  CONTROL  VARIABLES 

TIME  HORIZONTAL  VARIABLE 

NPGR  = NO.  OF  POINTS  TO  BE  PLOTTED  ON  A SINGLE  GRAPH. 

ISMB=  ARRAY  CONTAINING  INTEGERS  TO  SELECT  PLOTTING  SYMBOLS 
OX=V£RTICAl  VARIA0LE  ARRAY  FOR  TOP  GRAPH 

OY=  VERTICAL  VARIABLE  ARRAY  FOR  MIOOLE  GRAPH.  (IF  REQUESTED'. 
OZ-VERTICAL  VARIABLE  ARRAY  FOR  BOTTOM  GRAPH. C IF  REQUESTED) 
NOTE-  IF  VARIABLE  =-999999.  THEN  IT  IS  CONSIDERED 
ANO  IS  NOT  PLOTTED  OR  ANNOTATED. 


NGRAPH=  NO  OF  GRAPHS  TO  BE  CONSTRUCTED.  (MAX  3) 

DELTsTIME  RATE.  ITHIS  IS  USE''  FOR  DETERMINING  SCALE) 

ICALL  =1  IF  BEGINNING  OF  NEW  PLOT.  (NEW  FRAME  IS  STARTED  ANO 
LABELING  IS  PERFORMED.) 

= 0 IF  NOT  BEGINNING  OF  NEH  PLOT 
NPI=  NO  OF  POINTS  PER  INCH  TO  BE  PLOTTED.  THIS  IS  USEO  WITH 
OELT  TO  DETERMINE  SCALE.) 

RANGE  = ARRAY  CONTAINING  UPPER  ANO  LOWER  LIMITS  OF  DX.3Y.QZ. 

CtNi  = akkat  UuniaihjnG  itiiiaN  vfii.ut  ur  k«nG£S 
VERT  = BCD  ARRAY  CONTAINING  VERTICAL  LABELS. 

HOR  = BCD  ARRAY  CONT  AININT  3 HORIZONTAL  LABELS.  tl  AT  TOP,  l AT 
BOTTOM) 

TLABs  RATE  AT  WHICH  TIME  LABELING  AND  ANNOTATION  OF  CX,  CY,  CZ, 
IS  TO  BE  DONE. 

TMH3 K=  RATE  AT  WHICH  TIME  HACK  MARKS  ARE  TO  BE  INSERTED. 


***  LOCAL  VARIABLES 

NEWFRM= • TRUE.  INOTCATES  Tlhc  FOR  NEW  FRAME. 

LA0EL  = . TRUE.  INDICATES  TIME  FOR  A KNOT  A T I CN  OF  TIMC,  CX,  CY,  CZ . 
ILEP  T-  LEFT  STARTING  CASTOR  POSITION.  <90  FOR  FIRST  FRAME , ZERO 
OTHERWISE) 

XRAS T = GRAPH  HEIGHT  IN  RASTOR  UNITS. 

FHTIM  = TOTAL  TIME  FOR  A SINGLE  FRAME. 

TIMS  CL  = TIME  SCALE  FACTOR. 

SCAL  = ARRAY  CONTAINING  SCALE  FACTORS  FOR  OX,  OY#  OZ. 

timl=  t i he  corresponding  to  left  of  frahe< 

TIMRs  TIME  CORRESPONDING  TO  RIGHT  OF  FRAME. 


CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC 
COMMOM/L ABEL/ VERT (3) , HOR 16,3) 
COMMON/PLOT/NGRAPH,DELT,ICALL,NPI,RANGE  <2, 3).  TLAB, TMHCK 
EXTERNAL  TABL1V 
LOGICAL  NEWFRM, LABEL 
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IFCICALL  .£0.0) GO  TO  5 

XRAST  = ( 850- (NGRAPH-1) *5Q.)/NGRAPH 

ILY=9T5.-(XRAST/2.» *70. 

ILEFT  = 100 
FMTIM=CNPI*6.85I*0ELT 
TIMSCL=  1023./FHTIH 
CALL  3IGV 
CALL  FRAMEVC3) 

CALL  CHSIZV (2,2) 

CALL  RITSTVC13,i9,TA8LlV) 

CALL  RITE2V <10, 10 00,1023, 90,1, 56,1* HOP  Cl • 1), IER1) 
CALL  RITE2V(l3«47»lG23«90*lr56tlfHGRfl  ,2)  ,IER2I 
CALL  RITE2V<1Q,20,1023,90,1,56,  i,MOR t 1 , 3)  , IE R3I 
CAtL  IHMS(TIM£,IH,IMtSEC) 

I SEC  = SEC/30. 

ISEC  = ISEC  * 30 

XTIME  = IH  • 10000  ♦ IM  * 100  ♦ ISEC 
IXX  = ILEFT 

CALL  IABLVCXTIME,IXX,73,6,1.6) 

TT1  ss  IH  * 3600000  ♦ IH  * 60000  ♦ ISEC  • 1000 
TT2  = TIME 

IFIICALL.EQ.il  TIME  = TT1 
NEHFRMs.TRUE. 

TIMR  = 0. 

5 CONTINUE 

IFf TIME. GT.TIKR)NEWFRM=. TRUE. 

IF  (.tJOT.NEMFRMIGO  TC  100 
IF  I ICALL.EQ.OI  ILEFT  s (J 
IFCICALL.EG.OCALL  FRAMFVC3I 
IF(ICALL.EQ.0.ANO.TlME.GT.lTIMR*FMTIMniLEFT  = 200 
IV3GN1=935. 

DC  20  I*1,NGRAPH 
lYBGN2=IYBGM-XPASr/2, 

IY9GN3  = I YRGNt'-XRAST 
IFCICALL. EG.OIGO  TO  10 
SCALtI JxXRAST/ {RANGE (1 . 1 1 -RANGE (2, III 
CENT  C 1 1 = CRANGEd.il  ♦ RANGE  (2,111/2. 

CALL  APRNTVC0,-l«.fl0.VERT(n.6»tLYI 
ILY  = IL  Y-XRA5T-50. 

CALL  LABLVC PANGEC1 »I) ,1 8,1 Y9GN1 ,7 ,1, 61 
FVALUE  = RANGEfl.il  - ' RANGE  C 1 • II  - RANGE  C2,  I # I /2 1 

CALL  L AQL V CF  VA LUE  « 1 8 * I Y tlGN2  ,7 .1.61 
CALL  LA BLVC RANGE (2. II ,18, I Y3GN3, 7,1 , 6) 

CALL  LINE VC  I LEFT , I YBCN l , ILEFT , I Y BGN3 * 

10  CONTINUE 

CALL  LINEVCILEFT.IYBGNI,  102  3,  IY9GN1I 
CALL  LINEVC  ILEFT,  IYBGNH,  1023,  IYBGN 2) 

CALL  LlNEVCILEFTtlY BGN3 , 102 3, 1 YBGN3) 

YO*  L-XRAST/10 
DO  15  JK'1,4 

IBLINtf JKI=IYBGN2-Y0EL»JK 
ITLINECJK)  = IYBON2  ♦ YOEL4J< 

CALL  LINEV  (ILEFT, I3LTNE CJKI, 1023, I3LINECJKII 
15  CALL  LINEV  f IL EF T , I TL I NE C JKI , 1023, ITL INE  ( JKI I 
IY3GN1 =1 YBGNl-XRAST-50 • 
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20  CONTINUE 

TIMR=TIME*FHTIM*  Ci-ILEFT/1023.  I 

TIML^TIHE 

KEKFRM=. FALSE. 

COMPUTE  NEXT  LABEL  TIME 
TIMHCK=TIME  ♦ TMHCK-AMOO(TIME,TMHCK> 
XX  s TIMHCK 
IL  = I LEFT 

IF| ICALL.EQ.l)  IL=100 
25  IX  = IL  ♦ {XX-TIMD  • 1IMSCL 

TFfTX. £0.1023. CP .IX. EQ.1G22)  IX  = 1021 

IF! IX. GT. 1024)  GO  TO  35 

IYH  = 85  ♦ XRAST 

CALL  LINEV  I IX, 8S, IX, IYH) 

IFINGRAPM.cq.i j GO  TO  30 

IYTM  = IYM  « 50 

U H = IYTH  ♦ XRAST 

CALL-LINEV  CIX.IYTh,  IX.  IYH) 

IF( NGRAPm.EQ.2 ) GO  TO  30 
IYTH  = IYH  ♦ 50 
I YH  = I V TH  ♦ XRAST 
GALL  LINEV  ( IX,  IYTH,  IX,  IYH) 

30  CONTINUE 

X/  s xx  ♦ TMHCK 
GO  TO  ?5 
35  COITINUF 

95  T TM  * ftHsTT*«F*  TLAR-A’-OPI  TTMF.TLAB) 

IE( ICALL.EQ.l)  TIM?  = TT2 
10  0 CONTINUE 

LA3EL=. FALSE. 

14  5 CONTINUE 

Iff TI^E.L'.TI^LAB) GO  TO  150 
IF< TIMLAB.lT.TIVLJGO  TC  35 
call  IhmSITT-LA^.Ih, IM,SEC? 

XT!  ME  = IN  * 10  0 0 0 ♦ I*  • ICO  ♦ SEC 

lXX={TTMLAP-Tm)*T  IMSClMLEFT 

CALL  POINT V'  Ixx,85,?4,<) 

ixx  =IXX-2 0 

IFl  IXX.LT.3)  T < * = 8 

If  t UXX4j.9i.GT.  132  3)  I xx  = 3?5 

CAL  l L AOL V f X T I *£ , I X X , 7 3 , 6 , 1 ,5 1 

LABEL  - .TRUE. 

TIMLA3-T  TMLAH*  TL  A3 
GO  TO  1 s. 5 
150  CONTINUE 


IF(  UHf  .IT.TIMMCKIGO  TO  153 
IX=fT IMHC<-TI -L> *T I vSCL*ILEFT 
CALL  POlNTyl IX, 85, 5,1) 

IYw=«5  ♦ yRA~T 
IFINGRAPH.FO.I )GC  TO  151 
IYTH  = 85  ♦ yoAST  53 
CALL  POI NTy( lx, IYTH.S, 1) 

I YH  = I Y T M ♦ i ?A  5 T 
I F ( NO? APH . EG, 2 ) GO  TO  151 
IYTH  - IYTH  t XRAST  ♦ 50 


CALL  P0INTV<IX,IYTH,5,1J 
IYH=IYTHfXRAST 
151  CONTINUE 

TIMHCK=TIMHCK*TMHCK 
GO  TO  150 
153  CONTINUE 

IX=  1 TI  ME-TIML ) *TIMSCL«-IIEFT 
lY0=935.-XRAST/2. 

00  155  1= 1 ,NPGR 

IFC0X<I).GT.RANGEC1,1)  .OR.OX(H.lT.RANGEC2,m  GO  TO  1531 
IY  = COXCIl  - CENTCilt *SCAL C1I+  IYO 
CALL  POINTVClX,iY,ISH0CI)  »I DUN  ) 

CALL  POINTVCTX,lY,ISC'BCI), IOUM) 

1531  IF( NGRAPH.LT. 2>  GO  TC  155 
Ip(0Y<I).GT.PANGE(l»2).0R.0YCI).LT.RANGEC2.2))  GO  TO  1532 
IY  = C0Y<n-CENTC2n  *SCAL(2)MIY0-XRAST-5Q.) 

CALL  POINTVCIX,IY,ISR9CI>,IOUM) 

CALL  P0INTV(IX»IY»ISM9(IV .IDUM) 

1532  IFCNGRAPH.LT. 3)  GO  TC  155 

IFCOZC I) .GT.RANGEC1. 3) .OR.OZ  C I) .LT.  R ANGE 1 2,3 ) ) GO  TO  155 
IY  = COZCI)  - CENT<3)>*SCALC3) *1 YO-2 . * XR AST  - 100. 

CALL  POINTVCIX.lY.ISMeCIJ ,IOUM) 

CALL  POlNTVCIX.IY.IS'CEm.IOUM) 

155  CONTINUE 
IY  0=9  35 

1M  *ICUI  #LHUCLHjU  iu  c\sj 

IXX=IX-20 
IFC IXX.LT.8) IXX=8 
I^CIXX+AS.GT. 1023) TXX=975 
CALL  POINTV CIX,IYQ,-24,2) 

DO  160  1=1. NPGR 
111=1*12 

IFC  OX(I) • NE. -999999.  1 CALL  L ABLV C OX  Cl > , I XX , IYO ♦III ,6 , 1, 3) 
160  CONTINUE 

IFCNGRAPH.LT. 2)GO  TO  205 
IY1=IY0-XRAST-5O. 

CALI  P0INTVCIX.IY1.-2A.2) 

00  170  1=1, NPGR 
111=1*12 

IFCOYCIi . NE. -999999. ) CALL  LA9LVC0YCI) ,1 XX, I Y1 +111*6,1,3) 

170  CONTINUE 

IFCNGRAPH.LT. 3)GO  TO  205 
I Y2=IY1-XR AST-50. 

CALL  P0iNTVClX,IY2,-24,2) 

DO  180  1=1, NPGR 
II 1=1*12 

IFC 07  Cl) .NE. -9  99999. )CALL  LA8LVCDZCI) , IXX, I Y2 ♦ III, & , 1 , 3) 
180  CONTINUE 
20  5 CONTINUE 
RETURN 
END 
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SUBROUTINE  PLTMTXC A,NRA.NCAtIVIEW,IOPT , ICR0SS,X1,X2,Y1,Y2) 
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCoCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC. 


THIS  IS  A GENERAL  PURPOSE  ROUTINE  FOR  MAKING  AN  OBLIQUE  PLOT  OF 
A MATRIX  OF  VALUES.  IT  USES  PLT301  TO  ACTUALLY  PLOT  THE  DATA. 


C 


A = INPUT  APRAY  TO  OE  PLOTTED. 

NRA=NUMBER  OF  ROWS  IN  A 
NCA  = NUMBER  CF  COLUMNS  IN  A 
IVIEW  INDICATES  VIEWING  ANGLE 

= 1,  INDICATES  VIEW  FROM  ROTTOM  TO  TOP 

= 2,  INDICATES  VIEW  FROM  RIGHT  TO  LEFT 

* 3,  INDICATES  VIEW  FROM  TOP  TO  ROTTOM 

= A,  INDICATES  VIEW  FROM  LEFT  TO  RIGHT 

IOPT  = 0,  INDICATES  USE  DEFAULT  SCALING  FOR  VERTICAL  SCALE 
IOPT  = i»  INDICATES  VERTICAL  SCALE  WILL  BE  SET  BY  PROGRAMMER  (WTOP) 
INTERVAL  AT  WHICH  POINTS  BETWEEN  CURVES  WILL  BE  CONNECTEC  (USUALLY  II 
XI  = VARIABLE  ASSIGNEC  TO  LAST  ROW 
X2=  VARIABLE  ASSIGNED  TC  FIRST  ROW 
Yl=  VARIABLE  ASSIGNEC  TO  FIRST  COLUMN 


Y2=  VARIABLE  ASSIGNEC  TO  LAST  COLUMN 

NOTE-  IF  00  NCT  WISH  TO  ASSIGN  VARIABLES,  THEN 
SET  XI -NR  A , X2=?,  Yl=l,  Y2=NCA. 


cccccccccccccccccccccceccccccccccececccccccccccccccccccccccccccccccccccccccc 

DIMENSION  A(NRAtNCA) 

COMMCN/MAXVAL/UFOR,UPACKtVLEFT,VRTGHT,WBOT,WTOP, 

.THETAt'JAXIS,  VAXIS.UVPLNE  t IU(2,B)  ,lv'(?  ,2  } ,IWC2 ,2),JLA8 
DIMENSION  V(150> ,W (ISO) 

PRINT  qQQa,NRA,NCA, IVIEWtIOPTtICROSStXIt  X2,Y1,Y2 
9000  F0RMAT(lXt5I5,4F13. 41 

PRINT  9Q01t((A(I,J) ,J=1,NCA),I=1,NRA> 

30  0 i FORMAT ( 1X,13F13. 4) 

ZMIN  = 9999999. 

DX= (X2-Xl)/( NRA-1) 

DY= (Y2-Y1)/(NCA-1) 

JLAB=0 

ZMAX  = -999999. 

DO  15  1=1, NRA 
DO  15  J=1,NCA 
ZMAX=AMAXl(ZuAX,A(ItJI ) 

ZMTN=AMIN1(ZMIN,A(I, J)  I 
15  CONTINUE 

IF (I  VI EW.GT.4. OR. IVIEW.LT.1) IVIEW=1 
IF C IOPT. EQ.1IG0  TO  3 
WT0P=2.*ZMAX 
W80  T = T MIN 


THETA”-,  7 8 54 
3 CONTINUE 

UVPLNE=WB0T 

IF( ZMIN. LE.O .AND. ZMAX. GE.0)UVPLNE=0.  0 
CL IPU= 99 999999. 

CLI PO? -999999999. 
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11 


12 


13 


14 

5 

10 


21 

22 

23 

24 


XNOIS£  = 0.0 

UFOR=X2n*i?*13,1M»IVIEH 
UBACK=X1 
D£LU=-OX 
OElV=DY 
VIE  FT  = y t 
V«IGHT=Y2 
GO  TO  5 
UFOR=Y2 
UBACK=Yt 
QElU=~DY 
DElV=-DX 
VL£FT  = X2 
VRIGHT=xi 
GO  TO  5 
UF0R=X1 
U9ACK=X? 

DELU^OX 
OE  LV--DY 
VLEFT=Y2 
VRIGHTsyj 
GO  TO  5 
UFOR= Y 1 
U9ACK=  Y2 
DEL  U=  OY 
DEL  V=  OX 
VLEFT^X! 

VRIGHT^X? 

CONTINUE 

yjSHftaiK1 

H(l!“wnOT’150 

Vcyr°^(r"2*  > ¥DELU 
nn{ Ilf Q» 11  U=UFOR 
DO  590  J = 1 f N p T S' 

V ( J J=VLEFT«-(J-l»}*QFt.V 
fir/r*r3*NPTSlV(J)=VfJ-l) 

Ta  5’ 

OJ“  J 

GO  TO  2 5 
JJ  =NC A- { T -? » 
rr= nra- { j-i ) 

GO  TO  25 

r r=  r-i 

JJ=NCA-{J-l) 

GO  TO  25 

JJ  = I-1 

It=J 


< 
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25  CONTINUE 

PRINT  OOOA,NCURV*NPTS*t,J,II,JJ  ,I¥XE« 

SOGi t FORMAT! 1X,^3I5S 

563  CONTINUE 
WC.M-A 

IF (W'J»  .GT.CLIPU1WI J»=CLIPU 
IF  (Hi  J>  ,LT.CLlPD>WtJ) =CLIP0 
IF ( A3S (H  IJS) .LT.XNOISE  »W (JJ=KNOISE 
590  CONTINUE 

I PA SS-ICR0S3 
IF(I.EQ.t)  IPASS--1 

CALL  PLT30l(U,V,W, IPASS, NPTS) 

700  CONTINUE 
999  CONTINUE 

DEL  Ms  (HT0P-H80T) /13. 

P'.V*  CNPT5-i*/l5+i 
) LU=tNCURv'-  i)  / 2 0 ♦ 3. 

CA(  L PLT3DttUBAC<,0ELW,0«0,-6,l) 

CALu  r-i  T3C1(U3ACK.OELV,0.0,-7?NLV> 

CALL  PLT3Ol(U8ACK,0ELU,0,0,-8,NLU> 

RETURN 
END 

SUBROUTINE  PLT3Q1(UA,VA,WA.|PASS,NPT) 

COHHON/CLASF/TCLASS, TITLE (7» 

****  THIS  SUBROUTINE  WILL  MALE  03LIGUE  PLOT  OF  V VS  W FUNCTION  FCR 
A GIVEN  U. 

U=  VALUE  OF  INDEPENDENT  VARIABLE  U FOR  A GIVEN  V VS  W FUNCTION. 
v«  = CUE  DIMENSIONAL  ARRAY  CONTAINING  INDEPENDENT  VARIABLE  V. 

W A=DNE  DIMENSIONAL  ARRAY  CONTAINING  DEPENDENT  VARIABLE  W. 

W A ( I ) = F(U,VA(I)) 

NPTS=  NO.  OF  PTS  IN  VA,  WA  ARRAYS. 

NOTE  1 NPTS=IABS(NPT) 

IF <NPT„  LT01  THEN  WA (I >=F< UA ( I) ,VAI 1 1? 

TPASS=-i  IF  THIS  TS  FIRST  CALL  FOR  NEW  FRAME . 

IPASS  - -6  DRAW  LINES  FDR  W SCALE  ANi'  LABEL  V.  SCALE 
VAU)  IS  DELTA  » FOR  LABELING 

NPTS  = LABEL  EVFRY  NPTS  LINE 
IPASS  = -7  ORAW  LINES  FOR  V SCALE  AND  LAUEL  V SCA.E 
V Ml)  IS  DEI  TA  V FOR  LABELING 
NPTS  = LABEL  EVERY  NPTS  LINE 
IPASS  - -8  LABEL  *J  SCALE 

VA<1)  IS  DELTA  U FOR  LABELING 
NPTS  = LABEL  EVERY  NPTS  LINE 
IPASS  = -2  HIOOEN  Llr.ES  ARE  ELIMINATED, 

NOT  CONNFCTEO  T:  cVIOUS  LINE 

JLAB  IS  ALPHA  LABEL  FOR  US  _F 

JLAB  = 0 NOT  USE.  FOR  THIS  CALL 

IPA$S=N  INDICATES  EVERY  NTH  POINT  UN  THIS  CURVE  TO  NTH  POINT  OF 
PREVIOUS  CURVE  BE  CONNECTED. 
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COMMON / H A x V A L /UFC R « U 6 2 CK , VLE F T , VRIG  K T , W f» 0 T TOP, THETA* 

» UAXlSf /AXIS.UVPLNC  , IU  (2,  2i  1 1 V (2 .4  ' , I H< 2,2) 

♦ * JL A3 

C 

C HAXVAL  COMMON  IS  USED  FOR  DETERGING  SCALES* 
c U?UR*EXT*£M£  VALUE  OP  U FOR  FOREGROUND. 

C UGAGKs  EXTREME  VALUE  CF  U FOR  BACKGROUND* 

C VLEFT*  LEFT  MOST  VALUE  CF  /. 

C VkICHT=  RIGHT  MCST  value  f - V 

C MBOTsMIN  VALUE  C-  N. 

C vnQR-MAX  VALUE  0-  W 

C T / -:£  f A=  ANGLE  OF  C9LIQs'L  AXIS  HTTH  VERTICAL  (RADIANS).  NOTE-  IF  THIS 

C IS  ZERO.  THEN  RLC*  „lLt.  BE  TWO  DIMENSIONAL  WITH  U AND  H AXES  CO- 

C INCIDENTS 

C UAXISs  VALUE  Cr  V AT  WHICH  UAX!S  IS  TO  3E  ORAN N. 

C VAXTS=  VAS Lc  OF  y AT  WHICH  V AXIS  15  TO  BE  DRAWN. 

C siVPLNE*  /ALUE  GF  W AT  WHICH  UAXIS  AN?  VA.XTS  INTERSECT • 

DIMENSION  iXSAVIfeOi) ,IYSAV(68i) 

01  ME  NS I ON  UA ( NFT » t V A ( N°T) « WA  (NPT) 

COMMON  MINY  (iQ3<*>  * M AX  Y ( 1024  I 
LOGICAL  TOUT 
NP  TS=  T AQS ( NPT) 

PRINT  *6(50 * U FOR* UBACK*  VLEFT, VRIGHT *H80T * WTOPt  IPASS,NPTS 
at,  6 0 FORMAT  (lX.6rlO. 4,215) 

C PRINT  1000,  *J.  (VA(JJ).WA(JJ)  ,JJ*1»NPTS> 

• :c  FORMAT  «lX,FlG.3/CHX,lOFl0.3n 
IFCIPASS.GE.OIGO  TG  100 
IF(IPASS.EQ.-2)G0  TC  100 
IF( ID&SS.LE.-6)  GO  TO  305 

IF  FIRST  PASS,  ADVANCE  FRAME,  SET  SCALES,  AND  INITIALIZE  MAX  A NO 
MIN  FUNCTIONS.  The  FIRST  CURVE  TO  BE  PLOTTED  HILL  8E  The  FARTHEST 
I N THE  FOREGROUND.  SUCCEEDING  VALUES  OF  U MUST  BE  EITHER 
A-  ENDING  OF  DESCENDING. 

CALL  FRAMEV*3) 

■:****  FIMO  SCALE  VALUES  TO  MAKE  VARIABLES  PROPORTIONAL 
ULNTHsARS  (UF GR-U6ACK ) 

VLN T H=  AH S ( VR  TG H T-  VL E F T ) 

WLNTH=ABS(WTDP-W90T) 

SCMINsAMTNi (ULNTH, VLKTH« WLNTH) 

SCALU=SCMTN/ULNTH 
SCALV=SCMIN/VLNTH 
SCALW= SC MIN/ WLNTH 

IF(IPASS.NE.-3IGO  TO  18 
SCALU=1, 

SCALV= 1 . 

SCALW= 1 . 

19  CONTINUE 

ULNTH  = ULNTH* SC ALU 
DIFX=ULNTH*SIN (THFTA) 

IF (V RIGHT. L T.VLEF  D 0IFX=-DIFK 
IM  THE  TA.GE.  0.  OJGG  TC  20 

XRIGMT=VRIGHT*SCAIV 

XLEFT=VLEFT*SCALV*DIFX 
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GO  TO  30 
CONTINUE 

xleft=vleft*scalv 

XRIGHT=VRIGHT*SCAIV*DIFX 

CONTINUE 

YT3P=lWT0P-W80T)*SCALH*UlNTH*C0SfTHETAI ♦M80T*SCAtN 

Y80T-W90T*SCALH 

YSCAl£=YTCO-Y80T 

“SC  A.LE  = A8S  f XPIGHT-XLEFT) 

v'ALt  XSCALV{XLEFT,XRTGHT,74,50) 

' AL L YSCALV(YBOT,YTOP,50,74> 

00  60  1=1,1024 
MINY(I1=1Q00 
HAr.YU)  = 0 

IFn.GT.500)GO  TO  60 
IXSAVCI)=0 
IYSAVd»=3 
CONTINUE 

L*  CONSTRUCT  THE  AXES. 

('/  AXIS) 

r-.r*  n.MOfriMvtc.hr^oi  ti 

OX=  OEtU*S  ifi  CThET  A) 

I0ELXM$r/XSCAL£)*9CG. 

dy=delu*co*:<tmcta> 
ioely=(oy/v:^:ale>  *900. 

IX1  = NXV(VLEFT>SCALV)  dOELX 
IY1  = NYV(UVPLN£*S0UW)  UOEIY 
IX2  = NXV(VRIGHT*SCA<.V»  dCELX 
IY2=IY1 

CALL  UNEVdXl.  IYl*IX2ttVC) 

IW(1,1)=IX1 
IV(2,1)=IY1 
IV (1,?) 

IV<2.2)=JY2 

( W AXIS) 

1X1  = NX  V ( UA  XI S*  SC  AL  V)  ♦ICELX 
1X2  =1X1 

IY1  = NYV<  KnOT  *S  CALW) ♦ TOELY 
IY2=NYVtWT0P*SCALW) ♦IOELY 
CALL  LINEVlIXl , IY1, IX2.IY2) 

IW<1, 11=1X1 
IMf2,i)  = IYl 
IW(1,2) =1X2 
IW  (2,2)  = IY2 


(U  AXIS) 

IX1=NXV(UAXIS*SCALV) 
IY1=NYV£UVPLNE*SCALH) 
Y2=ULNTH*COS  C THETA) ♦UVPLNE’SCilW 
X2=0IFX»UAXIS¥SCALV 
IX?=NXV( X2i 
IY2=NYVC Y£> 
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r 


c 

100 


iiO 

120 

125 


130 


135 


CALL  l INEV(IX1,IY1,IX2,IY2) 

IU  (1,1>  = IX1 
IU( 2,1) =IYi 
IU  (1,21=1X2 
IU  <2,2)=IY2 

CONTINUE 
U=UA ( 1 ) 

IO'J  T = • TRUE  • 

ICOUNT  = I COUNT 
J=NPTS*1 

00  180  JJ=1,NPTS 
IF  (THET Al 110,120,120 
U=JJ 

GO  TO  125 
J = J-l 
CONTINUE 

IF CNPT.LT.0.)U=UA(J1 
IF (NPT.GT.O.ANC.JJ.GT.l)GO  TO  126 
DEL  U- AOS ( U-UF09) *SCALU 
DX=DELU*SIN!THETAI 
IOELX=(DX/XSCAL£l  *900. 

OY=UELU*CUS(  THfcMl 
IDE  LY= (DY/YSCALE)*900. 

CONTINUE 

IX2  = IXV(V£(J)*SCALVMIGEIX 
IY2  = TYV(WA(J)*SCALH»  *ICELY 
IFdPASS.LE.01G0  TO  170 
IF  ( *1CO(J-l,I  PASS)  • NE  . Q ) GO  TO  i?0 
NX=IABS(IX2-TXSAV(J) ) 

IF (NX . EQ. 0) NX=i 
INCX=i 

IFdXSAV(J) .GT. 1X2) INCX=-1 
QX=UX  ‘ 

DY  = TY2-IYSAVU> 

RAT  10=  UY/OX 
IYST=IYSAV( J) 

IX  = IXSAVU) 

IN  = 0 

DU  1 o5  1 = 1, NX 
IX  = IX*INCX 
IYs2Y5T*I*RATIO 

IFIIY.LE.HAXY(IX)  .AfiO.If.GE.NINY(IX)  )GO  TO  130 
IN=  0 

GO  TO  160 

IFdN.EO.l.OR.I.EG.DCO  TO  135 
IXCK=IXSAV  (J) 

IF(lY.GT,MAXYdX).ANC.  IYSAV(J)  .LT.HAXY  (IXCK)I  IYSAV  ( Jl  = NAXY  (IXCK) 
CALL  LINEV(IXSAV« J) , I YSAV C J) , IX ,IY) 

IXSAV(J)=IX 
IYSAV( J)=IY 
IN=  1 

CONTINUE 

MAXY(IX)=HAXO(IY,MAXY(!X)l 
MINYdX)  = HlNQ(IY,HINYdX)l 
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65  CONTINUE 

IF( IN. EQ. 0 ) CALL  LINE V < IXSAVt J) ,IYSAVIJ) *1X2, IY2) 

.70  CONTINUE 

IFUPASS.NE.-2)G0  TO  178 
IFCJJ.E0.1IG0  TO  175 
CALL  LINE VfIXl,IYl, 1X2, IY2) 

.75  CONTINUE 

IX1=IX2 
m=iY2 

IF ( JJ. N£« NPTS )G0  TO  180 
RETURN 

78  CONTINUE 

XXSAVC J)=IX2 
IYS AV< J)  -ViZ 
.80  CONTINUE 

IXI  = IXVCVA(1)*SCALV»  *ICELX 
IYi=UYV<KA{l)*SCALW) *ICELY 

IF( IY1.lt. MAXY(IX1»,ANC.IY1.GT.HINYIIX1»»  IOUT  r .FALSE 
DO  300  J=?,NPTS 
IX2=IXSAV< J) 

IY2 -IYSAV  < J) 

NX= 1 AHb  l 1X2-  1X1) 

IF! NX.EQ.0)NX=1 
I NCX  -1 

IFC  IX2'lT.IXU  INCX=-l 
DX=  NX 

0Y=IY2-IY1 
RAT IO=OY/OX 
IX=IX1 
IYST=IY1 
DO  230  K=1,NX 
IX=IX*INCX 
IY=IYST*K*RflTIO 
INDX=I X 

MXY=HAXY (INDX) 

KNY=MINY { INOX) 

IF  UY. LT.MXY.  ANO.  lY.GT.NNYlGO  TO  280 
IOUT=.TRUE. 

GO  TO  220 
CONTINUE 

IF(. NOT. IOUT)  GO  TO  210 
CALL  IINF.V(IX1,IY1,  IX, IY) 

CONTINUE 
IOUT=. FALSE. 

IXlalX 
I Y 1 = I Y 
CONTINUE 

MAXY  ( INOX)  = MAXO(IY,*'XYI 
HtNYUNOX)  -MlNOflY^hr; 

CONTINUE 

IF  ? .hOT . IOUT) GO  TO  23A 
CALL  LINEVUX1,IY1,IX2,TV2) 

CONTINUE 
INO 1=1X1 

NAXY(INOl)-NAX0tIYl,MAXYUNOi>) 


* 


30  0 

c**»* 


301 

305 


310 


315 


330 

340 


HINY(INDll=«lNO(IYlf»«INYCIMOlll 
1X1= 1X2 
I.Y1  = IY  2 
CONTINUE 

IFf  JLA9.EQ.O)GO  TO  301 
PRINT  HOLERTTH  JLAB 
I*L=IXVfVRIGHT*SCALVIdOELX»S 
IYL  - IOELY  ♦ IYV(MBCT*SCALM» 

CALL  PRINTVf8.JLA8.IXL.IYL) 

CALL  LlNE2VfIXL-l2,IYL-l,5,0) 

CALL  LlNE2VflXL-12,IYL-l,5,0) 

CONTINUE 

RETURN 

CONTINUE 

ILA  8 = 0 

IF{IPASS.E0.-8»  GO  TC  350 
DEL  U = A9SIUBACK-UF0K)  * SCALU 
OX  = OELU  • SIN  fTHETA) 

IDE  LX  = COX/XSCALE)  * 900. 

OY  = OELU  • COS  CTWETAI 
mri  v - t rv /vo*~m  c» 

IF f IPASS • EQ.-7 ) GO'TC' *30 
DELM  = VAtl) 

X0=AKAX1IABS(MT0P» .ABSIMBOTIJ 
NDL-NDMAXfXO) 

NOIG=MAXOf5,NCL) 

W - HBOT 

1X1  = NXV  f VLEFT  • SCALV) ♦ IOELX 
1X2  = NXVfVRIGHT  • SCALV) ♦ IOELX 
IY1  = NYVfW  * SCALW ) ♦ IOELY 
IY2  = I Y 1 

CALL  LINEVCIX1.IY1.IX2.IY2) 

IF f MOO (ILAB.NPTS).NE.O)  GO  TO  315 
CALL  LAeLV(W,IXl-52,m,NOIG  « l.NOL) 
CALL  LAQLVCW, 1X1-52, IY1.NOIG  .l.NOL) 
CONTINUE 
ILAB=  ILAB  ♦ l 
W = H ♦ OELH 
IF(H.GT.HTOP)  GO  TC  301 
GO  TO  310 

X0  = AMAX1 (ABSC VLEFT)  , AES tVRIGHT) ) 
N0L=N0MAXfX0) 

NOIG=MAXO (5.N0L) 

CONTINUE 
OEL  V = V A ( 1) 

V = VLEFT 

VR=VRIGHT».Q1*(VRIGHT-VLEFT) 

1X1  = NXVfV  * SCALV)  ♦ IOELX 

IY2  * NYVfHTCP  • SCAL4M  IOELY 

IY1  = N Y V C HBOT  * SCALN)  ♦ IOELf 

IX?  = T X 1 

1X3  = 1X1  - IOELX 

IY3  = I Y 1 - IOELY 

CALL  LlNEVfIXl.lYl.lX2.IY2) 

IFf MOOfILAB.NPTS  ).NE.Q)  GO  TO  345 
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345 


34  6 


350 


351 


355 


C 

C 


358 


CALL  LABLV(V, 1X2-28, IY2*6,NDIG 

CALL  LABLY(VvIX2-2S,IY2*6,NDIG  *1*?®!-! 

CALL  LA8LV(V, 1X3-28, IY3-6,N0IG  *i*NOLJ 

CALL  L ABLV ( V, 1X3-28, IY3-6,NOIG  ,l,HOLI 

CONTINUE 

ILAB  = ILA9  ♦ 1 

V-  V ♦ OELV 

If (VRIGHT.GT. VLEFTIGO  TO  346 
IFtV.LT.VR  )GO  TO  301 

GO  TO  340 

IFfV.GT.VR  1GO  TO  301 

GO  TO  340 

CONTINUE 

UB-UBACK+.01*  CUBACK-UFOR) 

ULA8  = UFOR 
CONTINUE 

OELAB  = AOS(ULAB-UFOP)  • SCALU 
OX  = OELAB  * SIN  (THETA) 

v.  m m * T n»*  Y • » 

U 1 — ULL4P  % V/UJ  I I ^L  I 

IOELX  = (OX/XSCALE)  * 900, 

IDE LY  = (DY/YSCALE)  • 9C0.  w 

1X1  s NXV ( VLEFT  * 5CALV)  ♦ IOELX 
1X2  = NXV(VRIGHT*SCALV»  ♦ IOELX 
IY1  = NYVCHBOT  * SCALH)  ♦ IOELY 
CONTINUE 

CALL  P0INTV(IX1,IY1, 6* ANYI 
CALL  P0INTV(IX2.IY1.6,ANY) 
IF(M00(ILA9,NPTS  ).NE.O)  GO  TO  358 
CALL  L ABLV (UL A B, 1X1-45 , I Yi , 5, 1 ,4 ) 

CALL  LA9LV(ULAe,IXi-45,IYl*5,l,4) 

ICH  = 4 

IF( ABS(ULAB) ,GT,9*>  ICH  = 5 
IF( ABS (ULAB)  .GT.99. ) I CH  = 6 
CALL  LA9LV(ULAB,IX2*4, IYl,ICH,l,ICH-2» 
CALL  LAQLV(ULA5, 1X2*4, IY1, ICH,1, ICH-21 
ILAR  = ILAR  * 1 


UL 


an 


UL  A 9 ♦ VAI1) 


IF(unACK,LT,UFOP)GC  TO  3441 
IF ( UL A H . GT . Ufl  ) GO  TO  301 

Gn  Tn  351 

-?c*4l  "IF  (ULAB.LT. UB  )GC  TO  301 
GO  TO  351 
END  ( 

jH  IS^UNCT' lON^RET'JPNS  THE  HAXIHUH  OCCIMAL  SCALE  NEEDED 
C FOR  USE  WITH  LAP.LV 

AX=A3G*x)*10.*.01 
DIG=AlQG10{AX) 

NO MAX =OIG 

IF(OIG.LT.O) NONAXsO 
IF  f NONAX.GT.6  ) NOMAX  =6 
ttr  turn 
END 

c«***  THIS^SUOROUTInPIrITES  CLASSIFICATION  AT  TOP  OF  PICTIRE  ANO  TITLE 
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^ COMMON^/ CLASP/  ICLASS, TITLE C8> 

EXTERNAL  TABL1V 

CALL  CHS IZ V f 4,4)  . ■ * - 

CALL  RlTSTV(23f33»TABLlVl 

ICALLLRITE2VUil5 iOOOt  i0|?*?2*|*i|**4*  tfurcKFIolNTIAL  InI) 
CALL  RITE2VM412,  20*  1023*99*  2*  12*- 1*  12HCCKF IDENTIAL  tNEI 

CALLORlfE2V(440«1000.i023*90*2*6*-1.6HS|CRET  ,N|I 

CALL  RTTE2VI440*  20 * 1023,90* 2*6*-l« bHSECRtT  *Nt» 

CONTINUE 

CALL  CHS 1 7V( 2*  2> 

CALL  RITSTV(13,18,TABL1V» 

CALLTRITE2V ( 10  0*989* 10  23*90 *2,70  ,1,T ITLE*HERI 

CONTINUE 

n-  » *«r»  %• 

* v/*v« 

END 


I 
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CHAPTER  7 
SUMMARY 
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Data  compression  and  maximization  of  information  contenc  has  become 
a technology  which  can  reduce:  (1)  computing  costs,  (2)  date  storage 
costs,  (3)  transfer  time/costs,  (4)  hardware  costs,  and  (5)  response 
(decision-making)  time.  It  is  realistic  to  expect  compression  ratios  in 
the  range  of  3:1  to  10:1  using  techniques  discussed  in  this  document. 

Since  any  large  data  base  problem  may  be  amenable  to  more  than  one 
compression/maximization  of  information  content  technique,  this  document 
categorizes  and  describes  individual  techniques  to  aid  the  user  in  a 
choice  for  his  application.  In  summarizing  techniques,  we  may  classify 
them  as  in  the  diagram.  Figure  7-1. 

7 . 1 REDUNDANT  DATA  REMOVAL  TECHNIQUES 

These  techniques  are  successful  if  sampling  rates  are  fixed  and 
generally  greater  than  the  usual  data  information  rates.  They  eliminate 
data  samples  that  can  be  implied  by  examination  of  preceding  or  succeeding 
samples;  or  by  comparison  with  arbitrary  reference  patterns. 

7.2  TRANSFORM  METHODS /LUMPED  PARAMETER  TECHNIQUES 

This  family  of  techniques  operates  on  data  samples  via  mathematical 
transformations  whereby  all  the  original  data  samples  are  irretrievably 
lost,  but  are  represented  by  parameters  in  a domain  other  than  time 
(such  as  frequency  or  sequency).  The  original  data  may  be  reconstructed 
within  some  error  tolerance  by  the  inverse  transformations. 


* 
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Preceding  page  blank 
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7.3  STATISTICAL  REPRESENTATION  TECHNIQUES 

As  in  the  case  of  transform  methods,  the  original  data  is  lost  and 
from  there  on  is  represented  by  other  parameters  such  as  statistical 
parameters,  coefficients  in  a math  model,  or  a smaller  sampling.  The 
original  data  may  not  be  reconstructed. 

7.4  OPTIMAL  ESTIMATION  TECHNIQUES 

The  objective  of  optimal  techniques  is  to  minimize  some  selected 
measure  of  e*ror  and  to  utilize  all  information  concerning  system  dynamics, 
noise  statistics,  and  initial  conditions.  An  optimal  technique  provides 
a performance  standard  for  comparison  and  evaluation  of  suboptimal 

approaches. 

7.5  MAXIMIZATION  OF  INFORMATION  CONTENT 

Provides  suggested  practical  techniques  for  reducing  the  sheer 
volume  of  data  when  trade-offs  in  accuracy  and  storage/retrieval  costs 
can  be  accepted.  Also,  included  are  suggestions  for  presenting  large 
amounts  of  data  to  the  user  via  a few  general  purpose  graphics  routines. 
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